Abstract. We compare tools for complementing nondeterministic Büchi automata with
a recent termination-analysis algorithm. Complementation of Büchi automata is a key step
in program verification. Early constructions using a Ramsey-based argument have been
supplanted by rank-based constructions with exponentially better bounds. In 2001 Lee
et al. presented the size-change termination (SCT) problem, along with both a reduction
to Büchi automata and a Ramsey-based algorithm. The Ramsey-based algorithm was
presented as a more practical alternative to the automata-theoretic approach, but strongly
resembles the initial complementation constructions for Büchi automata.

We prove that the SCT algorithm is a specialized realization of the Ramsey-based 
complementation construction. To do so, we extend the Ramsey-based complementation 
construction to provide a containment-testing algorithm. Surprisingly, empirical analysis 
suggests that despite the massive gap in worst-case complexity, Ramsey-based approaches 
are superior over the domain of SCT problems. Upon further analysis we discover an 
interesting property of the problem space that both explains this result and provides a 
chance to improve rank-based tools. With these improvements, we show that theoretical 
gains in efficiency of the rank-based approach are mirrored in empirical performance. 



SETH FOGARTY AND MOSHE Y. VARDI

1998 ACM Subject Classification: D.2.4.

Key words and phrases: Büchi Complementation, Model Checking, Formal Verification, Automata, Büchi Automata.

* Earlier version appeared in TACAS09.

Work supported in part by NSF grants CCR-0124077, CCR-0311326, CCF-0613889, ANI-0216467, and CCF-0728882, by BSF grant 9800096, and by a gift from Intel.

DOI: 10.2168/LMCS-8 (1:13) 2012

© S. Fogarty and M. Y. Vardi, Creative Commons



1. Introduction

The automata-theoretic approach to formal program verification reduces questions about program adherence to a specification to questions about language containment. Representing liveness, fairness, or termination properties requires finite automata that operate on infinite words. One automaton, A, encodes the behavior of the program, while another automaton, B, encodes the formal specification. To ensure adherence, verify that the intersection of A with the complement of B is empty. Finite automata on infinite words are classified by their acceptance condition and transition structure. We consider here nondeterministic Büchi automata, in which a run is accepting when it visits at least one accepting state infinitely often. For these automata, the complementation problem is known to involve an exponential blowup [25]. Thus the most difficult step in checking containment is constructing the complementary automaton $\overline{B}$.

The first complementation constructions for nondeterministic Büchi automata employed a Ramsey-based combinatorial argument to partition the set of all infinite words into a finite set of omega-regular languages. Proposed by Büchi in 1962 [5], this construction was shown in 1987 by Sistla, Vardi, and Wolper to be implementable with a blow-up of $2^{O(n^2)}$ [29]. This brought the complementation problem into singly-exponential blow-up, but left a gap with the $2^{\Omega(n \log n)}$ lower bound proved by Michel [25].

The gap was tightened in 1988, when Safra described a $2^{O(n \log n)}$ construction [26]. Work since then has focused on improving the practicality of $2^{O(n \log n)}$ constructions, either by providing simpler constructions, further tightening the bound [27], or improving the derived algorithms. In 2001, Kupferman and Vardi employed a rank-based analysis of Büchi automata to simplify complementation [23]. Recently, Doyen and Raskin have demonstrated the utility of using a subsumption technique in the rank-based approach, providing a direct universality checker that scales to automata several orders of magnitude larger than previous tools [10].

Separately, in the context of program termination analysis, Lee, Jones, and Ben-Amram presented the size-change termination (SCT) principle in 2001 [24]. This principle states that, for domains with well-founded values, if every infinite computation contains an infinitely decreasing value sequence, then no infinite computation is possible. Lee et al. describe a method of size-change termination analysis and reduce this problem to the containment of two Büchi automata. Citing the lack of efficient Büchi containment solvers, they also propose a Ramsey-based combinatorial solution that captures all possible call sequences in a finite set of graphs. The Lee, Jones, and Ben-Amram (LJB) algorithm was provided as a practical alternative to reducing the verification problem to Büchi containment, but bears a striking resemblance to the 1987 Ramsey-based complementation construction [29].

In this paper we show that the LJB algorithm for deciding SCT [24] is a specialized implementation of the 1987 Ramsey-based complementation construction [29]. Section 2 presents the background and notation for the paper. Section 3 expands the Ramsey-based complementation construction into a containment algorithm, and then presents the proof that the LJB algorithm is a specialized realization of this Ramsey-based containment algorithm. In Section 4 we empirically explore Lee et al.'s intuition that Ramsey-based algorithms are more practical than Büchi complementation tools on SCT problems. Initial experimentation does suggest that Ramsey-based tools are superior on SCT problems. This is surprising, as the worst-case complexity of the LJB algorithm is significantly worse than that of rank-based tools. Investigating this discovery in Section 5, we note that it is natural for SCT problems to be reverse-deterministic, and that for reverse-deterministic problems the worst-case bound for Ramsey-based algorithms matches that of the rank-based approach. This suggests improving the rank-based approach in the face of reverse determinism. Indeed, we find that reverse-deterministic SCT problems have a maximum rank of 2, collapsing the complexity of rank-based complementation to $2^{O(n)}$. Revisiting our experiments, we discover that with this improvement rank-based tools are superior on the domain of SCT problems. To further explore this phenomenon, we generate a set of non-reverse-deterministic SCT problems from monotonicity constraint systems, a more complex termination problem. We conclude with a discussion in Section 6.
2. Preliminaries

In this section we review the relevant details of Büchi complementation and size-change termination, introducing along the way the notation used throughout this paper. A nondeterministic Büchi automaton on infinite words is a tuple $B = \langle \Sigma, Q, Q^{in}, \rho, F \rangle$, where $\Sigma$ is a finite nonempty alphabet, $Q$ a finite nonempty set of states, $Q^{in} \subseteq Q$ a set of initial states, $F \subseteq Q$ a set of accepting states, and $\rho : Q \times \Sigma \to 2^Q$ a nondeterministic transition function. We lift the $\rho$ function to sets of states and words of arbitrary length as follows. Given a set of states $R$, define $\rho(R, \sigma)$ to be $\bigcup_{q \in R} \rho(q, \sigma)$. Inductively, given a set of states $R$: let $\rho(R, \epsilon) = R$, and for every word $w = \sigma_0 \ldots \sigma_n$ let $\rho(R, w)$ be defined as $\rho(\rho(R, \sigma_0), \sigma_1 \ldots \sigma_n)$.

A run of a Büchi automaton $B$ on a word $w \in \Sigma^\omega$ is an infinite sequence $r = q_0 q_1 \ldots \in Q^\omega$ such that $q_0 \in Q^{in}$ and, for every $i \geq 0$, we have $q_{i+1} \in \rho(q_i, w_i)$. A run is accepting iff $q_i \in F$ for infinitely many $i \in \mathbb{N}$. A word $w \in \Sigma^\omega$ is accepted by $B$ if there is an accepting run of $B$ on $w$. The words accepted by $B$ form the language of $B$, denoted by $L(B)$. Correspondingly, a path in $B$ from $q$ to $r$ on a word $w \in \Sigma^+$ is a finite sequence $r = q_0 \ldots q_n \in Q^+$ such that $q_0 = q$, $q_n = r$, and, for every $i \in \{0 \ldots n-1\}$, we have $q_{i+1} \in \rho(q_i, w_i)$. A path is accepting if some state in the path is in $F$.

Example 2.1. An example automaton is shown in Figure 1, which accepts words with a finite, but non-zero, number of a's. The automaton waits in q and guesses when it has seen the last a, transitioning on that a to r. If the automaton moves to r prematurely, it can transition to s before it encounters any remaining a's to continue the run. From s, it can guess once again when it has seen the last a, transitioning this time to t.

A Büchi automaton $A$ is contained in a Büchi automaton $B$ iff $L(A) \subseteq L(B)$, which can be checked by verifying that the intersection of $A$ with the complement $\overline{B}$ of $B$ is empty: $L(A) \cap L(\overline{B}) = \emptyset$. We know that the language of an automaton is non-empty iff there are states $q \in Q^{in}$, $r \in F$ such that there is a path from $q$ to $r$ and an accepting path from $r$ to itself. The initial path is called the prefix, and the combination of the prefix and cycle is called a lasso [31]. Furthermore, the intersection of two automata can be constructed, having a number of states proportional to the product of the number of states of the original automata [6]. Thus, the most computationally demanding step is constructing the complement of $B$. In the formal verification field, existing empirical work has focused on the simplest form of containment testing, universality testing, where $A$ is the universal automaton [9, 30].
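The lasso criterion above can be made concrete with a minimal explicit-state sketch. It is an illustration only, not the procedure used by any tool discussed in this paper. The dictionary encoding of $\rho$ and the names `reachable` and `is_nonempty` are assumptions introduced here.

```python
from collections import deque

def reachable(succ, sources):
    """Return the states reachable from `sources` in zero or more steps."""
    seen = set(sources)
    queue = deque(sources)
    while queue:
        q = queue.popleft()
        for r in succ.get(q, ()):
            if r not in seen:
                seen.add(r)
                queue.append(r)
    return seen

def is_nonempty(delta, initial, accepting):
    """Lasso test: `delta` maps (state, letter) to a set of successor states.

    The language is non-empty iff some accepting state r is reachable from
    an initial state (the prefix) and r can reach itself in one or more
    steps (the cycle, which is accepting because it contains r).
    """
    succ = {}
    for (q, _letter), targets in delta.items():
        succ.setdefault(q, set()).update(targets)
    for r in reachable(succ, initial) & set(accepting):
        # r lies on a cycle iff it is reachable from one of its successors.
        if r in reachable(succ, succ.get(r, ())):
            return True
    return False
```

Because the cycle starts and ends in an accepting state, no separate check that the cycle is accepting is needed.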
Figure 2: Three graphs in $\widetilde{Q}_B$ for the automaton of Figure 1. From left to right, the graph describing the word a, the graph describing the word b, and the graph describing the word ab.

2.1. Ramsey-Based Universality. When Büchi introduced these automata in 1962, he described a complementation construction involving a Ramsey-based combinatorial argument [5]. We describe an optimized implementation presented in 1987 [29]. To construct the complement of $B = \langle \Sigma, Q, Q^{in}, \rho, F \rangle$, where $Q = \{q_0, \ldots, q_{n-1}\}$, we construct a set $\widetilde{Q}_B$ whose elements capture the essential behavior of $B$. Each element corresponds to an answer to the following question:

Given a finite nonempty word $w$, for every two states $q, r \in Q$:

(1) Is there a path in $B$ from $q$ to $r$ over $w$?

(2) If so, is some such path accepting?

Define $Q' = Q \times \{0,1\} \times Q$, and $\widetilde{Q}_B$ to be the subset of $2^{Q'}$ whose elements, for every $q, r \in Q$, do not contain both $\langle q, 0, r \rangle$ and $\langle q, 1, r \rangle$. Each element of $\widetilde{Q}_B$ is a $\{0,1\}$-arc-labeled graph on $Q$. An arc represents a path in $B$, and the label is 1 if the path is accepting. Note that there are $3^{n^2}$ such graphs. With each graph $\widetilde{g} \in \widetilde{Q}_B$ we associate a language $L(\widetilde{g})$, the set of words for which the answer to the posed question is the graph encoded by $\widetilde{g}$.

Definition 2.2. Let $\widetilde{g} \in \widetilde{Q}_B$ and $w \in \Sigma^+$. Then $w \in L(\widetilde{g})$ iff, for all pairs of states $q, r \in Q$:

(1) $\langle q, a, r \rangle \in \widetilde{g}$, $a \in \{0,1\}$, iff there is a path in $B$ from $q$ to $r$ over $w$.

(2) $\langle q, 1, r \rangle \in \widetilde{g}$ iff there is an accepting path in $B$ from $q$ to $r$ over $w$.

Example 2.3. Three graphs from $\widetilde{Q}_B$ are shown in Figure 2. All graphs have a non-empty language. The word a is in the language of the first graph, the word b is in the language of the second graph, and the word ab is in the language of the third graph.
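As a hedged illustration of Definition 2.2, the graph describing a finite word can be computed by building one graph per letter and composing them left to right. The encoding of a graph as a `frozenset` of `(q, label, r)` triples, the function names, and the two-state automaton used in the usage note are assumptions made for this sketch; they are not the automaton of Figure 1.

```python
def letter_graph(delta, accepting, letter):
    """Graph of a one-letter word: an arc q -> r for each transition.

    The path q, r is accepting if either endpoint is an accepting state.
    """
    arcs = set()
    for (q, a), targets in delta.items():
        if a == letter:
            for r in targets:
                label = 1 if (q in accepting or r in accepting) else 0
                arcs.add((q, label, r))
    return frozenset(arcs)

def compose(g, h):
    """Graph of uv, given the graph g of u and the graph h of v.

    There is a path q -> r over uv iff some midpoint m joins an arc of g
    to an arc of h; the path is accepting if either half is accepting.
    """
    best = {}
    for (q, a, m) in g:
        for (m2, b, r) in h:
            if m == m2:
                best[(q, r)] = max(best.get((q, r), 0), a, b)
    return frozenset((q, label, r) for (q, r), label in best.items())

def word_graph(delta, accepting, word):
    """Graph describing a non-empty finite word, built letter by letter."""
    g = letter_graph(delta, accepting, word[0])
    for letter in word[1:]:
        g = compose(g, letter_graph(delta, accepting, letter))
    return g
```

For a hypothetical automaton with `delta = {('q','a'): {'q'}, ('q','b'): {'q','r'}, ('r','b'): {'r'}}` and accepting set `{'r'}`, the call `word_graph(delta, {'r'}, "ab")` yields the arcs `('q', 0, 'q')` and `('q', 1, 'r')`.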

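Lemma 2.4. [5, 29]

(1) $\{L(\widetilde{g}) \mid \widetilde{g} \in \widetilde{Q}_B\}$ is a partition of $\Sigma^+$.

(2) If $u \in L(\widetilde{g})$, $v \in L(\widetilde{h})$, and $uv \in L(\widetilde{k})$, then $L(\widetilde{g}) \cdot L(\widetilde{h}) \subseteq L(\widetilde{k})$.

The languages $L(\widetilde{g})$, for the graphs $\widetilde{g} \in \widetilde{Q}_B$, form a partition of $\Sigma^+$. With this partition of $\Sigma^+$ we can devise a finite family of $\omega$-languages that cover $\Sigma^\omega$. For every $\widetilde{g}, \widetilde{h} \in \widetilde{Q}_B$, let $Y(\widetilde{g}, \widetilde{h})$ be the $\omega$-language $L(\widetilde{g}) \cdot L(\widetilde{h})^\omega$. Say that a language $Y(\widetilde{g}, \widetilde{h})$ is proper if $Y(\widetilde{g}, \widetilde{h})$ is non-empty, $L(\widetilde{g}) \cdot L(\widetilde{h}) \subseteq L(\widetilde{g})$, and $L(\widetilde{h}) \cdot L(\widetilde{h}) \subseteq L(\widetilde{h})$. There are a finite, if exponential, number of such languages. A Ramsey-based argument shows that every infinite string belongs to a language of this form, and that $\overline{L(B)}$ can be expressed as the union of languages of this form.

Lemma 2.5. [5, 29] $\Sigma^\omega = \bigcup \{Y(\widetilde{g}, \widetilde{h}) \mid Y(\widetilde{g}, \widetilde{h}) \text{ is proper}\}$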

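Proof: The proof is based on Ramsey's Theorem. Consider an infinite word $w = \sigma_0 \sigma_1 \ldots$ By Lemma 2.4, every prefix of the word $w$ is in the language of a unique graph $\widetilde{g}_l$. Let $k = 3^{n^2}$ be the number of graphs. Thus $w$ defines a partition of $\mathbb{N}$ into $k$ sets $D_1, \ldots, D_k$ such that $i \in D_l$ iff $\sigma_0 \ldots \sigma_{i-1} \in L(\widetilde{g}_l)$. Clearly there is some $m$ such that $D_m$ is infinite.

Similarly, by Lemma 2.4 we can use the word $w$ to define a partition of all pairs of elements $(i, j)$ from $D_m$, where $i < j$. This partition consists of $k$ sets $C_1, \ldots, C_k$, such that $(i, j) \in C_l$ iff $\sigma_i \ldots \sigma_{j-1} \in L(\widetilde{g}_l)$. Ramsey's Theorem tells us that, given such a partition, there exists an infinite subset $\{i_1, i_2, \ldots\}$ of $D_m$ and a $C_n$ such that for all pairs of distinct elements $i_j, i_k$, it holds that $(i_j, i_k) \in C_n$.

This implies that the word $w$ can be partitioned into

$w_1 = \sigma_0 \ldots \sigma_{i_1 - 1}, \quad w_2 = \sigma_{i_1} \ldots \sigma_{i_2 - 1}, \quad w_3 = \sigma_{i_2} \ldots \sigma_{i_3 - 1}, \quad \ldots$

where $w_1 \in L(\widetilde{g}_m)$ and $w_i \in L(\widetilde{g}_n)$ for $i > 1$. By construction, $\sigma_0 \ldots \sigma_{i_j - 1} \in L(\widetilde{g}_m)$ for every $i_j$, and thus we have that $w_1 w_2 \in L(\widetilde{g}_m)$. In addition, as $\sigma_{i_j} \ldots \sigma_{i_k - 1} \in L(\widetilde{g}_n)$ for every pair $i_j, i_k$, we have that $w_2 w_3 \in L(\widetilde{g}_n)$. By Lemma 2.4, it follows that $L(\widetilde{g}_m) \cdot L(\widetilde{g}_n) \subseteq L(\widetilde{g}_m)$ and that $L(\widetilde{g}_n) \cdot L(\widetilde{g}_n) \subseteq L(\widetilde{g}_n)$, and thus $Y(\widetilde{g}_m, \widetilde{g}_n)$ is proper. □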

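Furthermore, each proper language is either entirely contained in or entirely disjoint from $L(B)$. This provides a way to construct the complement of $L(B)$: take the union of every proper language that is disjoint from $L(B)$.

Lemma 2.6. [5, 29]

(1) For $\widetilde{g}, \widetilde{h} \in \widetilde{Q}_B$, either $Y(\widetilde{g}, \widetilde{h}) \cap L(B) = \emptyset$ or $Y(\widetilde{g}, \widetilde{h}) \subseteq L(B)$.

(2) $\overline{L(B)} = \bigcup \{Y(\widetilde{g}, \widetilde{h}) \mid Y(\widetilde{g}, \widetilde{h}) \text{ is proper and } Y(\widetilde{g}, \widetilde{h}) \cap L(B) = \emptyset\}$

To obtain the complementary Büchi automaton $\overline{B}$, Sistla et al. construct, for each $\widetilde{g} \in \widetilde{Q}_B$, a deterministic automaton on finite words, $B_{\widetilde{g}}$, that accepts exactly $L(\widetilde{g})$. Using the automata $B_{\widetilde{g}}$, one can then construct the complementary automaton $\overline{B}$ [29]. We can then use a lasso-finding algorithm on $\overline{B}$ to prove the emptiness of $\overline{B}$, and thus the universality of $B$. However, we can avoid an explicit lasso search by employing the rich structure of the graphs in $\widetilde{Q}_B$. For every two graphs $\widetilde{g}, \widetilde{h} \in \widetilde{Q}_B$, determine if $Y(\widetilde{g}, \widetilde{h})$ is proper. If $Y(\widetilde{g}, \widetilde{h})$ is proper, test if it is contained in $L(B)$ by looking for a lasso with a prefix in $\widetilde{g}$ and a cycle in $\widetilde{h}$. $B$ is universal if every proper $Y(\widetilde{g}, \widetilde{h})$ is so contained.

Lemma 2.7. [29] Given a Büchi automaton $B$ and the set of graphs $\widetilde{Q}_B$,

(1) $B$ is universal iff for every proper $Y(\widetilde{g}, \widetilde{h})$, it holds that $Y(\widetilde{g}, \widetilde{h}) \subseteq L(B)$.

(2) Let $\widetilde{g}, \widetilde{h} \in \widetilde{Q}_B$ be two graphs where $Y(\widetilde{g}, \widetilde{h})$ is proper. $Y(\widetilde{g}, \widetilde{h}) \subseteq L(B)$ iff there exists $q \in Q^{in}$, $r \in Q$, $a \in \{0,1\}$ where $\langle q, a, r \rangle \in \widetilde{g}$ and $\langle r, 1, r \rangle \in \widetilde{h}$.

Lemma 2.7 yields a PSPACE algorithm to determine universality [29]. Simply check each $\widetilde{g}, \widetilde{h} \in \widetilde{Q}_B$. If $Y(\widetilde{g}, \widetilde{h})$ is both proper and not contained in $L(B)$, then the pair $(\widetilde{g}, \widetilde{h})$ provides a counterexample to the universality of $B$. If no such pair exists, the automaton must be universal.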
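The test of Lemma 2.7 can be sketched in explicit form as follows. This is a simplified illustration that stores every graph in memory, so it is not the PSPACE procedure itself. Two assumptions are made here and are not stated in the text above: only graphs of actual words are enumerated (by closing the one-letter graphs under composition, so every language involved is non-empty), and properness is tested through composition, which Lemma 2.4 justifies for such graphs. All function names are hypothetical.

```python
def letter_graph(delta, accepting, letter):
    """Graph of a one-letter word, as a frozenset of (q, label, r) arcs."""
    arcs = set()
    for (q, a), targets in delta.items():
        if a == letter:
            for r in targets:
                arcs.add((q, 1 if (q in accepting or r in accepting) else 0, r))
    return frozenset(arcs)

def compose(g, h):
    """Graph of uv from the graphs of u and v."""
    best = {}
    for (q, a, m) in g:
        for (m2, b, r) in h:
            if m == m2:
                best[(q, r)] = max(best.get((q, r), 0), a, b)
    return frozenset((q, label, r) for (q, r), label in best.items())

def graph_closure(delta, accepting, alphabet):
    """All graphs with a non-empty language: graphs of words of length >= 1."""
    letters = [letter_graph(delta, accepting, a) for a in alphabet]
    graphs = set(letters)
    frontier = list(graphs)
    while frontier:
        g = frontier.pop()
        for l in letters:
            gl = compose(g, l)  # extend the described word by one letter
            if gl not in graphs:
                graphs.add(gl)
                frontier.append(gl)
    return graphs

def is_universal(delta, initial, accepting, alphabet):
    """Check every proper pair (g, h) for a lasso, following Lemma 2.7."""
    graphs = graph_closure(delta, accepting, alphabet)
    for g in graphs:
        for h in graphs:
            # Properness for graphs of actual words: g;h = g and h;h = h.
            if compose(g, h) == g and compose(h, h) == h:
                contained = any(q in initial and (r, 1, r) in h
                                for (q, _a, r) in g)
                if not contained:
                    return False  # (g, h) is a counterexample
    return True
```

A word with no path at all is described by the empty graph, which has no arc from an initial state, so any proper pair whose prefix graph is empty is reported as a counterexample.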
2.2. Rank-Based Complementation. While our focus is mainly on the Ramsey-based approach, in Section 5 we look at the rank-based construction described here. If a Büchi automaton $B$ does not accept a word $w$, then every run of $B$ on $w$ must eventually cease visiting accepting states. The rank-based construction, foreshadowed in [20] and first introduced in [21], uses a notion of ranks to track the progress of each possible run towards this point. Consider a Büchi automaton $B = \langle \Sigma, Q, Q^{in}, \rho, F \rangle$ and an infinite word $w = \sigma_0 \sigma_1 \ldots$ The runs of $B$ on $w$ can be arranged in an infinite DAG (directed acyclic graph), $G_w = \langle V, E \rangle$, where

• $V \subseteq Q \times \mathbb{N}$ is such that $\langle q, l \rangle \in V$ iff some run $r$ of $B$ on $w$ has $r(l) = q$.

• $E \subseteq \bigcup_{l \geq 0} (Q \times \{l\}) \times (Q \times \{l+1\})$ is such that $E(\langle q, l \rangle, \langle q', l+1 \rangle)$ iff $\langle q, l \rangle \in V$ and $q' \in \rho(q, \sigma_l)$.

$G_w$, called the run DAG of $B$ on $w$, exactly embodies all possible runs of $B$ on $w$. We define a run DAG $G_w$ to be accepting when there exists a path in $G_w$ with infinitely many states in $F$. This path corresponds to an accepting run of $B$ on $w$. When $G_w$ is not accepting, we say it is a rejecting run DAG. Say that a node $\langle q, i \rangle$ of a graph is finite if it has only finitely many descendants, that it is accepting if $q \in F$, and that it is F-free if it is not accepting and does not have accepting descendants.

Given a run DAG $G_w$, we inductively define a sequence of subgraphs by eliminating nodes that cannot be part of accepting runs. A node that is finite can clearly not be part of an infinite run, much less an accepting infinite run. Similarly, a node that is F-free may be part of an infinite run, but this infinite run cannot visit an infinite number of accepting states.

• $G_w(0) = G_w$

• $G_w(2i+1) = G_w(2i) \setminus \{\langle q, l \rangle \mid \langle q, l \rangle \text{ is finite in } G_w(2i)\}$

• $G_w(2i+2) = G_w(2i+1) \setminus \{\langle q, l \rangle \mid \langle q, l \rangle \text{ is F-free in } G_w(2i+1)\}$

If the final graph a node appears in is $G_w(i)$, we say that node is of rank $i$. If the rank of every node is at most $i$, we say $G_w$ has a rank of $i$. $B$ has a maximum rank of $i$ when every rejecting run DAG of $B$ has a maximum rank of $i$. Note that nodes with odd ranks are removed because they are F-free. Therefore no accepting state can have an odd rank. Kupferman and Vardi prove that the maximum rank of a rejecting run DAG for every automaton is bounded by $2|Q| - 2$ [21]. This allows us to create an automaton that guesses the ranking of the rejecting run DAG of $B$ as it proceeds along the word.

A level ranking for an automaton $B$ with $n$ states is a function $f : Q \to \{0 \ldots 2n-2, \bot\}$, such that if $q \in F$ then $f(q)$ is even or $\bot$. Let $\sigma$ be a letter in $\Sigma$ and $f, f'$ be two level rankings. Say that $f$ covers $f'$ under $\sigma$ when for all $q$ and every $q' \in \rho(q, \sigma)$, if $f(q) \neq \bot$ then $f'(q') \neq \bot$ and $f'(q') \leq f(q)$; i.e. no transition between $f$ and $f'$ on $\sigma$ increases in rank. Let $F_R$ be the set of all level rankings.
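The two conditions just defined are easy to state as executable checks. The following is a minimal sketch under assumed encodings: a level ranking is a dictionary from states to integers, $\bot$ is represented by `None`, and the names `is_level_ranking` and `covers` are introduced here for illustration.

```python
BOTTOM = None  # stands for the undefined rank, written as bottom in the text

def is_level_ranking(f, states, accepting):
    """Check f : Q -> {0..2n-2, BOTTOM} with no odd rank on accepting states."""
    n = len(states)
    for q in states:
        rank = f[q]
        if rank is BOTTOM:
            continue
        if not (0 <= rank <= 2 * n - 2):
            return False
        if q in accepting and rank % 2 == 1:
            return False
    return True

def covers(f, f_next, delta, letter):
    """True iff f covers f_next under `letter`.

    Every successor of a ranked state must itself be ranked, and its rank
    must not exceed the rank of its predecessor.
    """
    for q, rank in f.items():
        if rank is BOTTOM:
            continue
        for q2 in delta.get((q, letter), ()):
            if f_next[q2] is BOTTOM or f_next[q2] > rank:
                return False
    return True
```

For a hypothetical two-state automaton the maximum rank is $2 \cdot 2 - 2 = 2$, so a ranking that assigns 3 to any state is rejected by `is_level_ranking`.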

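Definition 2.8. If $B = \langle \Sigma, Q, Q^{in}, \rho, F \rangle$ is a Büchi automaton, define $KV(B)$ to be the automaton $\langle \Sigma, F_R \times 2^Q, \{\langle f_{in}, \emptyset \rangle\}, \rho', F_R \times \{\emptyset\} \rangle$, where

• $f_{in}(q) = 2n - 2$ for each $q \in Q^{in}$, $\bot$ otherwise.

• Define $\rho' : (F_R \times 2^Q) \times \Sigma \to 2^{(F_R \times 2^Q)}$ to be

- If $o \neq \emptyset$ then $\rho'(\langle f, o \rangle, \sigma) = \{\langle f', o' \setminus d \rangle \mid f \text{ covers } f' \text{ under } \sigma,\ o' = \rho(o, \sigma),\ d = \{q \mid f'(q) \text{ odd}\}\}$.

- If $o = \emptyset$ then $\rho'(\langle f, o \rangle, \sigma) = \{\langle f', o' \rangle \mid f \text{ covers } f' \text{ under } \sigma,\ o' = \{q \mid f'(q) \text{ even}\}\}$.

Lemma 2.9. [22] For every Büchi automaton $B$, $L(KV(B)) = \overline{L(B)}$.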
This automaton tracks the progress of $B$ along a word $w = \sigma_0 \sigma_1 \ldots$ by attempting to find an infinite series $f_0 f_1 \ldots$ of level rankings. We start with the most general possible level ranking, and ensure that every rank $f_i$ covers $f_{i+1}$ under $\sigma_i$. Every run has a non-increasing rank, and so must eventually become trapped in some rank. To accept a word, the automaton requires that each run visit an odd rank infinitely often. Recall that accepting states cannot be assigned an odd rank. Thus, for a word rejected by $B$, every run can eventually become trapped in an odd rank. Conversely, if there is an accepting run that visits an accepting node infinitely often, that run cannot visit an odd rank infinitely often and the complementary automaton rejects it.

An algorithm seeking to refute the universality of $B$ can look for a lasso in the state-space of the rank-based complement of $B$. A classical approach is the Emerson-Lei backward-traversal nested fixpoint $\nu Y. \mu X. (Pre(X) \cup (Pre(Y) \cap F))$ [11]. This nested fixpoint employs the observation that a state in a lasso can reach an arbitrary number of accepting states. The outer fixpoint iteratively computes sets $Y_0, Y_1, \ldots$ such that $Y_i$ contains all states with a path visiting $i$ accepting states. Universality is checked by testing if $Y_\infty$, the set of all states with a path visiting arbitrarily many accepting states, intersects $Q^{in}$. The strongest algorithm implementing this approach, from Doyen and Raskin, takes advantage of the presence of a subsumption relation in the rank-based construction: one state $\langle f, o \rangle$ subsumes another $\langle f', o' \rangle$ iff: $f'(x) \leq f(x)$ for every $x \in Q$; $o' \subseteq o$; and $o = \emptyset$ iff $o' = \emptyset$. When computing sets in the Emerson-Lei approach, it is sufficient to store only the maximal elements under this relation. Furthermore, the predecessor operation for a single state and letter results in at most two incomparable elements. This algorithm has scaled to automata an order of
magnitude larger than other approaches [S]. 
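To make the nested fixpoint concrete, here is a hedged explicit-state sketch of $\nu Y. \mu X. (Pre(X) \cup (Pre(Y) \cap F))$ over a small finite graph. It enumerates plain sets of states and does not use the subsumption relation or the antichain representation of Doyen and Raskin. The successor-map encoding and the names `pre` and `emerson_lei` are assumptions made for this illustration.

```python
def pre(succ, targets):
    """States with at least one successor in `targets`."""
    return {q for q, rs in succ.items() if rs & targets}

def emerson_lei(succ, states, accepting):
    """Greatest fixpoint Y of: Y = least X with X = Pre(X) | (Pre(Y) & F).

    `succ` maps each state to its set of successors. The result is the set
    of states that start a path visiting accepting states infinitely often.
    """
    accepting = set(accepting)
    Y = set(states)
    while True:
        # Inner least fixpoint: states that can reach, in zero or more
        # steps, an accepting state that has a successor in Y.
        X = set()
        while True:
            X_next = pre(succ, X) | (pre(succ, Y) & accepting)
            if X_next == X:
                break
            X = X_next
        if X == Y:
            return Y
        Y = X
```

Applied to the state space of a complement automaton, a non-empty intersection of the result with the initial states would witness a word rejected by $B$.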

2.3. Size-Change Termination. In [24] Lee et al. proposed the size-change termination (SCT) principle for programs: "If every infinite computation would give rise to an infinitely decreasing value sequence, then no infinite computation is possible." The original presentation concerned a first-order pure functional language, where every infinite computation arises from an infinite call sequence and values are always passed through a sequence of parameters.

Proving that a program is size-change terminating is done in two phases. The first extracts from a program a set of size-change graphs, $\mathcal{G}$, containing guarantees about the relative size of values at each function call site. The second phase, and the phase we focus on, analyzes these graphs to determine if every infinite call sequence has a value that descends infinitely along a well-ordered set. For an excellent discussion of the abstraction of functional language semantics, refer to [19]. We consider here a set $H$ of functions, and denote the parameters of a function $f$ by $P(f)$.

Definition 2.10. A size-change graph (SCG) from function $f_1$ to function $f_2$, written $G : f_1 \to f_2$, is a bipartite $\{0,1\}$-arc-labeled graph from the parameters of $f_1$ to the parameters of $f_2$, where $G \subseteq P(f_1) \times \{0,1\} \times P(f_2)$ does not contain both $x \xrightarrow{1} y$ and $x \xrightarrow{0} y$.

Size-change graphs capture information about a function call. An arc $x \xrightarrow{1} y$ indicates that the value of $x$ in the function $f_1$ is strictly greater than the value passed as $y$ to function $f_2$. An arc $x \xrightarrow{0} y$ indicates that $x$'s value is greater than or equal to the value given to $y$. We assume that all call sites in a program are reachable from the entry points of the program.^1

Figure 3: Size-Change Graphs: A size-change problem with two functions, f and g, and three call sites: a call a to g occurring in the body of f, and two recursive calls, b and c, from g to itself.

Figure 4: The dotted line forms a prefix of a late-start thread in the call sequence abcbcbc....

A size-change termination (SCT) problem is a tuple $L = \langle H, P, C, \mathcal{G} \rangle$, where $H$ is a set of functions, $P$ a mapping from each function to its parameters, $C$ a set of call sites between these functions, and $\mathcal{G}$ a set of SCGs for $C$. A call site is written $c : f_1 \to f_2$ for a call to function $f_2$ occurring in the body of $f_1$. The size-change graph for a call site $c : f_1 \to f_2$ is written as $G_c$. Given an SCT problem $L$, a call sequence in $L$ is an infinite sequence $cs = c_0, c_1, \ldots \in C^\omega$, such that there exists a sequence of functions $f_0, f_1, \ldots$ where $c_0 : f_0 \to f_1$, $c_1 : f_1 \to f_2$, .... A thread in a call sequence $c_0, c_1, \ldots$ is a connected sequence of arcs, $x \xrightarrow{a} y, y \xrightarrow{b} z, \ldots$, beginning in some call $c_i$ such that $x \xrightarrow{a} y \in G_{c_i}$, $y \xrightarrow{b} z \in G_{c_{i+1}}$, .... We say that $L$ is size-change terminating if every call sequence contains a thread with infinitely many 1-labeled arcs. Note that a thread need not begin at the start of a call sequence. A sequence must terminate if a well-founded value decreases infinitely often, regardless of when this decrease begins. Therefore threads can begin in arbitrary function calls, in arbitrary parameters. We call this the late-start property of SCT problems. We revisit this property in Section 3.3.

Example 2.11. Three size-change graphs, which will provide a running example for this paper, are presented in Figure 3. The represented problem is size-change terminating. The call sequence abcbc... is displayed in Figure 4, where a thread of infinite descent exists, starting in the second graph. This late-start thread proves the sequence terminating.

Every call sequence can be represented as a word in $C^\omega$, and an SCT problem reduced to the containment of two $\omega$-languages. The first language, $Flow(L) = \{cs \in C^\omega \mid cs \text{ is a call sequence}\}$, contains all call sequences. The second language, $Desc(L) = \{cs \in Flow(L) \mid \text{some thread in } cs \text{ has infinitely many 1-labeled arcs}\}$, contains only call sequences that guarantee termination. An SCT problem $L$ is size-change terminating if and only if $Flow(L) \subseteq Desc(L)$.

^1 The implementation provided by Lee et al. [24] also makes this assumption, and in the presence of unreachable functions size-change termination may be undetectable.

Figure 5: $A_{Flow(L)}$ and $A_{Desc(L)}$: the automata resulting from applying Definition 2.12 to the SCT problem of Figure 3.

Lee et al. [24] describe two Büchi automata, $A_{Flow(L)}$ and $A_{Desc(L)}$, that accept these languages. $A_{Flow(L)}$ is simply the call graph of the program. $A_{Desc(L)}$ waits in a copy of the call graph and nondeterministically chooses the beginning point of a descending thread. From there it ensures that a 1-labeled arc is taken infinitely often. To do so, it keeps two copies of each parameter, and transitions to the accepting copy only on a 1-labeled arc. Lee et al. prove that $L(A_{Flow(L)}) = Flow(L)$, and $L(A_{Desc(L)}) = Desc(L)$. The automata for our running example are provided in Figure 5.

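Definition 2.12.^2

$A_{Flow(L)} = \langle C, H, H, \rho_F, H \rangle$, where

• $\rho_F(f_1, c) = \{f_2 \mid c : f_1 \to f_2\}$

$A_{Desc(L)} = \langle C, Q_p \cup H, Q_p \cup H, \rho_D, F \rangle$, where

• $Q_p = \{\langle x, r \rangle \mid f \in H, x \in P(f), r \in \{1, 0\}\}$,

• $\rho_D(f_1, c) = \{f_2 \mid c : f_1 \to f_2\} \cup \{\langle x, 0 \rangle \mid c : f_1 \to f_2, x \in P(f_2)\}$

• $\rho_D(\langle x, r \rangle, c) = \{\langle x', r' \rangle \mid x \xrightarrow{r'} x' \in G_c\}$,

• $F = \{\langle x, 1 \rangle \mid f \in H, x \in P(f)\}$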



Using the complementation constructions of either Section 2.1 or 2.2 and a lasso-finding algorithm, we can determine the containment of $A_{Flow(L)}$ in $A_{Desc(L)}$. Lee et al. propose an alternative graph-theoretic algorithm, employing SCGs to encode descent information about entire call sequences. A notion of composition is used, where a call sequence $c_0 \ldots c_{n-1}$ has a thread from $x$ to $y$ if and only if the composition of the SCGs for each call, $G_{c_0}; \ldots; G_{c_{n-1}}$, contains the arc $x \xrightarrow{a} y$. The closure of $\mathcal{G}$ under the composition operation, called $S$, is then searched for a counterexample describing an infinite call sequence with no infinitely descending thread.

^2 The original LJB construction [24] restricted starting states in $A_{Desc(L)}$ to functions. This was changed to simplify Section 3.4. The modification does not change the accepted language.

Figure 6: The composition of the SCGs for b and c, from Figure 3. The resulting size-change graph is idempotent, contains the arc $y \xrightarrow{1} y$, and describes the call sequence of Figure 4.

Definition 2.13. Let $G : f_1 \to f_2$ and $G' : f_2 \to f_3$ be two SCGs. Their composition $G; G'$ is defined as $G'' : f_1 \to f_3$ where:

$G'' = \{x \xrightarrow{1} z \mid x \xrightarrow{a} y \in G,\ y \xrightarrow{b} z \in G',\ y \in P(f_2),\ a = 1 \text{ or } b = 1\}$

$\cup\ \{x \xrightarrow{0} z \mid x \xrightarrow{0} y \in G,\ y \xrightarrow{0} z \in G',\ y \in P(f_2), \text{ and for all } y', a, b \text{ if } x \xrightarrow{a} y' \in G \text{ and } y' \xrightarrow{b} z \in G' \text{ then } a = b = 0\}$
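Definition 2.13 translates almost directly into code. The sketch below is illustrative only: it encodes an SCG as a `frozenset` of `(x, label, y)` triples, leaves the source and target functions implicit, and uses the hypothetical name `compose_scg`. The example graphs in the usage note are invented for this sketch and are not the graphs of Figure 3.

```python
def compose_scg(g, g2):
    """Composition G;G' of two size-change graphs.

    An arc x -> z gets label 1 if some connecting pair x -a-> y, y -b-> z
    has a = 1 or b = 1, and label 0 if connecting pairs exist but all of
    them have a = b = 0. Taking the maximum over all pairs yields exactly
    these two cases.
    """
    best = {}
    for (x, a, y) in g:
        for (y2, b, z) in g2:
            if y == y2:
                best[(x, z)] = max(best.get((x, z), 0), a, b)
    return frozenset((x, label, z) for (x, z), label in best.items())

def is_idempotent(g):
    """A graph G : f -> f is idempotent when G = G;G."""
    return compose_scg(g, g) == g
```

For instance, composing `{('x', 1, 'y'), ('x', 0, 'z')}` with `{('y', 0, 'x'), ('z', 0, 'x'), ('z', 0, 'y')}` gives the arcs `('x', 1, 'x')` and `('x', 0, 'y')`: the strict descent through `y` dominates the non-strict route through `z`.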

Using composition, we can focus on a subset of graphs. Say that a graph $G : f \to f$ is idempotent when $G = G; G$. Each idempotent graph describes a cycle in the call graph, and a Ramsey-based argument shows that each cycle in the call graph can be accounted for by at least one idempotent graph. The composition of two graphs is shown in Figure 6, which describes the call sequence in Figure 4.

Algorithm LJB searches for a counterexample to size-change termination. First, it iteratively builds the closure set $S$: initialize $S$ as $\mathcal{G}$; and for every $G : f_1 \to f_2$ and $G' : f_2 \to f_3$ in $S$, include the composition $G; G'$ in $S$. Second, the algorithm checks every $G : f_1 \to f_1 \in S$ to ensure that if $G$ is idempotent, then $G$ has an associated thread with infinitely many 1-labeled arcs. This thread is represented by an arc of the form $x \xrightarrow{1} x$. There are pathological SCT problems for which the complexity of Algorithm LJB is $2^{O(n^2)}$.

The next theorem, whose proof uses a Ramsey-based argument, demonstrates the correctness of Algorithm LJB in determining the size-change termination of an SCT problem $L = \langle H, P, C, \mathcal{G} \rangle$.

Theorem 2.1. [24] An SCT problem $L = \langle H, P, C, \mathcal{G} \rangle$ is not size-change terminating iff $S$, the closure of $\mathcal{G}$ under composition, contains an idempotent SCG $G : f \to f$ that does not contain an arc of the form $x \xrightarrow{1} x$.



3. Size-Change Termination and Ramsey-Based Containment 



The Ramsey-based test of Section 2.1 and the LJB algorithm of Section 2.3 bear a remarkable similarity. In this section we bridge the gap between the Ramsey-based universality test and the LJB algorithm, by demonstrating that the LJB algorithm is a specialized realization of the Ramsey-based containment test. This first requires developing a Ramsey-based framework for Büchi containment testing.
Algorithm 1: LJB($\langle H, P, C, \mathcal{G} \rangle$)

Data: A size-change termination problem $\langle H, P, C, \mathcal{G} \rangle$.
Result: Whether or not the problem is size-change terminating.
Initialize $S \Leftarrow \mathcal{G}$
repeat
    for all pairs $G : f \to g$, $G' : g \to h$ in $S$ do
        $G'' : f \to h \Leftarrow G; G'$
        Add $G''$ to $S$
        if $f = h$ and $G''; G'' = G''$ then
            if there does not exist an arc of the form $x \xrightarrow{1} x$ in $G''$ then
                return Not Terminating
until $S$ reaches closure
return Terminating
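A runnable sketch of Algorithm 1 follows. It is not the implementation of Lee et al. The encoding of an SCG as a `(source, target, arcs)` triple and the names `compose_arcs` and `ljb_terminates` are assumptions made here. Unlike the pseudocode, the sketch completes the closure before checking idempotent graphs; by Theorem 2.1 this returns the same answer, only without the early exit.

```python
def compose_arcs(g, g2):
    """Arc set of G;G' (Definition 2.13), arcs encoded as (x, label, y)."""
    best = {}
    for (x, a, y) in g:
        for (y2, b, z) in g2:
            if y == y2:
                best[(x, z)] = max(best.get((x, z), 0), a, b)
    return frozenset((x, label, z) for (x, z), label in best.items())

def ljb_terminates(scgs):
    """Return True iff the SCT problem is size-change terminating.

    `scgs` is an iterable of (source_function, target_function, arcs),
    where arcs is a frozenset of (parameter, label, parameter) triples.
    """
    closure = set(scgs)
    frontier = list(closure)
    while frontier:
        current = frontier.pop()
        for other in list(closure):
            # Try both orders, composing only when the functions line up.
            for left, right in ((current, other), (other, current)):
                if left[1] == right[0]:
                    new = (left[0], right[1], compose_arcs(left[2], right[2]))
                    if new not in closure:
                        closure.add(new)
                        frontier.append(new)
    for (f, h, arcs) in closure:
        if f == h and compose_arcs(arcs, arcs) == arcs:
            # Idempotent graph on a cycle: it needs an arc x -1-> x.
            if not any(x == y and label == 1 for (x, label, y) in arcs):
                return False
    return True
```

For a single recursive call whose graph is `{('x', 1, 'x')}` the function returns `True`, while replacing the label by 0 yields `False`, since the parameter is then never guaranteed to decrease.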



3.1. Ramsey-Based Containment with Supergraphs. To test the containment of a Büchi automaton $A$ in a Büchi automaton $B$, we could construct the complement of $B$ using either the Ramsey-based or rank-based construction, compute the intersection automaton of $A$ and $\overline{B}$, and search this intersection automaton for a lasso. With universality, however, we avoided directly constructing $\overline{B}$ by exploiting the structure of states in the Ramsey-based construction (see Lemma 2.7). We demonstrate a similar test for containment.

Consider two automata, $A = \langle \Sigma, Q_A, Q_A^{in}, \rho_A, F_A \rangle$ and $B = \langle \Sigma, Q_B, Q_B^{in}, \rho_B, F_B \rangle$. When testing the universality of $B$, any word not in $L(B)$ is a sufficient counterexample. To test $L(A) \subseteq L(B)$ we must restrict our search to the subset of $\Sigma^\omega$ accepted by $A$. In Section 2.1, we defined a set $\widetilde{Q}_B$ of 0-1 arc-labeled graphs, whose elements provide a family of $\omega$-languages that covers $\Sigma^\omega$ (see Lemma 2.5). We now define a set, $\widehat{Q}_{A,B}$, which provides a family of $\omega$-languages covering $L(A)$.

We first define $\bar{Q}_A = Q_A \times Q_A$ to capture the connectivity in $Q_A$. An element $\bar{g} = \langle q, r \rangle \in \bar{Q}_A$ is a single arc asserting the existence of a path in $A$ from $q$ to $r$. With each arc we associate a language, $L(\bar{g})$.

Definition 3.1. Given $w \in \Sigma^+$, say that $w \in L(\langle q, r \rangle)$ iff there is a path in $A$ from $q$ to $r$ over $w$.

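Define $\widehat{Q}_{A,B}$ as $\bar{Q}_A \times \widetilde{Q}_B$. The elements of $\widehat{Q}_{A,B}$, called supergraphs, are pairs consisting of an arc from $\bar{Q}_A$ and a graph from $\widetilde{Q}_B$. Each element simultaneously captures all paths in $B$ and a single path in $A$. The language $L(\langle \bar{g}, \widetilde{g} \rangle)$ is then $L(\bar{g}) \cap L(\widetilde{g})$. For convenience, we implicitly take $\widehat{g} = \langle \bar{g}, \widetilde{g} \rangle$, and say $\langle q, a, r \rangle \in \widehat{g}$ when $\langle q, a, r \rangle \in \widetilde{g}$. Since the language of each graph consists of finite words, we employ the concatenation of languages to characterize infinite runs. To do so, we first prove Lemma 3.2, which simplifies the concatenation of entire languages by demonstrating an equivalence to the concatenation of arbitrary words from these languages.

Lemma 3.2. If $u \in L(\widehat{g})$, $v \in L(\widehat{h})$, $uv \in L(\widehat{k})$, and $L(\bar{g}) \cdot L(\bar{h}) \subseteq L(\bar{k})$, then $L(\widehat{g}) \cdot L(\widehat{h}) \subseteq L(\widehat{k})$.

Proof: Assume we have such $u$ and $v$. We demonstrate that every word $w \in L(\widehat{g}) \cdot L(\widehat{h})$ must be in $L(\widehat{k})$. If we expand the premise, we obtain $w \in (L(\bar{g}) \cap L(\widetilde{g})) \cdot (L(\bar{h}) \cap L(\widetilde{h}))$. This implies $w$ must be in $L(\bar{g}) \cdot L(\bar{h})$ and in $L(\widetilde{g}) \cdot L(\widetilde{h})$. Next, we know that $u \in L(\widetilde{g})$, $v \in L(\widetilde{h})$, and $uv \in L(\widetilde{k})$. Thus by Lemma 2.4, $L(\widetilde{g}) \cdot L(\widetilde{h}) \subseteq L(\widetilde{k})$, and $w \in L(\widetilde{k})$. Along with the premise $L(\bar{g}) \cdot L(\bar{h}) \subseteq L(\bar{k})$, we can now conclude $w \in L(\bar{k}) \cap L(\widetilde{k})$, which is $L(\widehat{k})$. □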

The languages $L(\hat{g})$, $\hat{g} \in \widehat{Q}_{A,B}$, cover all finite subwords of $L(A)$. A subword of $L(A)$
has at least one path between two states in $Q_A$, and thus is in the language of an arc in
$\bar{Q}_A$. Furthermore, by Lemma 2.4 this word is described by some graph, and the pair of the
arc and the graph makes a supergraph. Unlike the case of graphs and $\Sigma^+$, the languages of
supergraphs do not form a partition of $L(A)$: a word might have multiple paths between
states in $A$, and so be described by more than one arc in $\bar{Q}_A$. With these languages we construct
the finite family of $\omega$-languages that cover $L(A)$. Given $\hat{g}, \hat{h} \in \widehat{Q}_{A,B}$, let $Z(\hat{g}, \hat{h})$ be the
$\omega$-language $L(\hat{g}) \cdot L(\hat{h})^\omega$. In analogy to Section 2.1, call $Z(\hat{g}, \hat{h})$ proper if: (1) $Z(\hat{g}, \hat{h})$ is
non-empty; (2) $\bar{g} = \langle q, r \rangle$ and $\bar{h} = \langle r, r \rangle$ where $q \in Q^{in}_A$ and $r \in F_A$; (3) $L(\hat{g}) \cdot L(\hat{h}) \subseteq L(\hat{g})$
and $L(\hat{h}) \cdot L(\hat{h}) \subseteq L(\hat{h})$. Call a pair of supergraphs $\langle \hat{g}, \hat{h} \rangle$ proper if $Z(\hat{g}, \hat{h})$ is proper. We
note that $Z(\hat{g}, \hat{h})$ is non-empty if $L(\hat{g})$ and $L(\hat{h})$ are non-empty, and that, by the second
condition, every proper $Z(\hat{g}, \hat{h})$ is contained in $L(A)$.

Lemma 3.3. Let $A$ and $B$ be two Büchi automata, and $\widehat{Q}_{A,B}$ the corresponding set of
supergraphs. $L(A) = \bigcup \{ Z(\hat{g}, \hat{h}) \mid \hat{g}, \hat{h} \in \widehat{Q}_{A,B},\ Z(\hat{g}, \hat{h}) \text{ is proper} \}$.

Proof: We extend the Ramsey argument of Lemma 2.5 to supergraphs.

Consider an infinite word $w = \sigma_0 \sigma_1 \ldots$ with an accepting run $p = p_0 p_1 \ldots$ in $A$. As $p$ is
accepting, we know that $p_0 \in Q^{in}_A$ and $p_i \in F_A$ for infinitely many $i$. Since $F_A$ is finite, at
least one accepting state $q$ must appear infinitely often. Let $D \subseteq \mathbb{N}$ be the set of indexes $i$
such that $p_i = q$.

We pause to observe that, by the definition of the languages of arcs, for every $i \in D$ the
word $\sigma_0 \ldots \sigma_{i-1}$ is in $L(\langle p_0, q \rangle)$, and for every $i, j \in D$, $i < j$, the word $\sigma_i \ldots \sigma_{j-1} \in L(\langle q, q \rangle)$.
Every language $Z(\hat{g}, \hat{h})$ where $\bar{g} = \langle p_0, q \rangle$ and $\bar{h} = \langle q, q \rangle$ thus satisfies the second requirement
of properness.

In addition to restricting our attention to the subset of nodes where $p_i = q$, we further
partition $D$ into $k = 3^{n^2}$ sets $D_1, \ldots, D_k$ based on the prefix of $w$ until that point, where
there is a $D_l$ associated with each possible graph $\tilde{g}_l$. By Lemma 2.4 every finite word is in
the language of some graph $\tilde{g}$. Say that $i \in D_l$ iff $\sigma_0 \ldots \sigma_{i-1} \in L(\tilde{g}_l)$. As $k$ is finite, for some
$m$, $D_m$ must be infinite. Let $\tilde{g} = \tilde{g}_m$.



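Similarly, by Lemma 2.4 we can use the word $w$ to define a partition of all unordered
pairs of elements from $D_m$. This partition consists of $k$ sets $C_1, \ldots, C_k$, such that $\langle i, j \rangle \in C_l$
iff $\sigma_i \ldots \sigma_{j-1} \in L(\tilde{g}_l)$. Without loss of generality, for $\langle i, j \rangle \in C_l$, assume $i < j$. Ramsey's
Theorem tells us that, given such a partition, there exists an infinite subset $\{i_1, i_2, \ldots\}$ of
$D_m$ and a $C_n$ such that $\langle i_j, i_k \rangle \in C_n$ for all pairs of distinct elements $i_j, i_k$.

This is precisely to say there is a graph $\tilde{h}$ so that, for every $\langle i_j, i_k \rangle \in C_n$, it holds that
$\sigma_{i_j} \ldots \sigma_{i_k - 1} \in L(\tilde{h})$. $C_n$ thus partitions the word $w$ into

$w_1 = \sigma_0 \ldots \sigma_{i_1 - 1},\quad w_2 = \sigma_{i_1} \ldots \sigma_{i_2 - 1},\quad w_3 = \sigma_{i_2} \ldots \sigma_{i_3 - 1},\quad \ldots$

such that $w_1 \in L(\tilde{g})$ and $w_i \in L(\tilde{h})$ for $i > 1$. Let $\hat{g} = \langle \langle p_0, q \rangle, \tilde{g} \rangle$ and let $\hat{h} = \langle \langle q, q \rangle, \tilde{h} \rangle$. By
the above partition of $w$, we know that $w \in Z(\hat{g}, \hat{h})$.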

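We now show that $Z(\hat{g}, \hat{h})$ is proper. First, as $w \in Z(\hat{g}, \hat{h})$, we know $Z(\hat{g}, \hat{h})$ is
non-empty. Second, as noted above, the second requirement is satisfied by the arcs $\langle p_0, q \rangle$
and $\langle q, q \rangle$. Finally, we demonstrate the third condition holds. As $\sigma_0 \ldots \sigma_{i-1} \in L(\tilde{g})$ for
every index $i$ in the subset $\{i_1, i_2, \ldots\}$, we have that $w_1 w_2 \in L(\tilde{g})$. Both $w_1$ and $w_1 w_2$ are in $L(\langle p_0, q \rangle)$ and so
$w_1, w_1 w_2 \in L(\hat{g})$. By the definition of the language of arcs, $L(\langle p_0, q \rangle) \cdot L(\langle q, q \rangle) \subseteq L(\langle p_0, q \rangle)$.
Thus by Lemma 3.2, we can conclude that $L(\hat{g}) \cdot L(\hat{h}) \subseteq L(\hat{g})$. Next observe that as
$\sigma_i \ldots \sigma_{j-1} \in L(\tilde{h})$ for every pair $\langle i, j \rangle \in C_n$ drawn from this subset, we have that $w_2 w_3 \in L(\tilde{h})$. As $w_2, w_2 w_3$ are
both in $L(\langle q, q \rangle)$, it holds that $w_2, w_2 w_3 \in L(\hat{h})$. By the definition of the language of arcs,
$L(\langle q, q \rangle) \cdot L(\langle q, q \rangle) \subseteq L(\langle q, q \rangle)$. By Lemma 3.2 we can now conclude $L(\hat{h}) \cdot L(\hat{h}) \subseteq L(\hat{h})$.

Therefore $Z(\hat{g}, \hat{h})$ is a proper language containing $w$. □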

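Lemma 3.4. Let $A$ and $B$ be two Büchi automata, and $\widehat{Q}_{A,B}$ the corresponding set of
supergraphs.

(1) For all proper $Z(\hat{g}, \hat{h})$, either $Z(\hat{g}, \hat{h}) \cap L(B) = \emptyset$ or $Z(\hat{g}, \hat{h}) \subseteq L(B)$.

(2) $L(A) \subseteq L(B)$ iff every proper language $Z(\hat{g}, \hat{h}) \subseteq L(B)$.

(3) Let $\hat{g}, \hat{h}$ be two supergraphs such that $Z(\hat{g}, \hat{h})$ is proper. $Z(\hat{g}, \hat{h}) \subseteq L(B)$ iff there exists
$q \in Q^{in}_B$, $r \in Q_B$, $a \in \{0, 1\}$ such that $\langle q, a, r \rangle \in \hat{g}$ and $\langle r, 1, r \rangle \in \hat{h}$.

Proof: Given two supergraphs $\hat{g} = \langle \bar{g}, \tilde{g} \rangle$ and $\hat{h} = \langle \bar{h}, \tilde{h} \rangle$, recall that $Y(\tilde{g}, \tilde{h})$ is the
$\omega$-language $L(\tilde{g}) \cdot L(\tilde{h})^\omega$. Further note that $L(\hat{g}) \subseteq L(\tilde{g})$ and $L(\hat{h}) \subseteq L(\tilde{h})$, and therefore
$Z(\hat{g}, \hat{h}) \subseteq Y(\tilde{g}, \tilde{h})$.

(1) Consider two supergraphs $\hat{g}, \hat{h}$. By Lemma 2.6 either $Y(\tilde{g}, \tilde{h}) \cap L(B) = \emptyset$ or $Y(\tilde{g}, \tilde{h}) \subseteq L(B)$.
Since $Z(\hat{g}, \hat{h}) \subseteq Y(\tilde{g}, \tilde{h})$, it holds that $Z(\hat{g}, \hat{h}) \cap L(B) = \emptyset$ or $Z(\hat{g}, \hat{h}) \subseteq L(B)$.

(2) Immediate from Lemma 3.3 and clause (1).

(3) By Lemma 2.6 either $Y(\tilde{g}, \tilde{h}) \subseteq L(B)$ or $Y(\tilde{g}, \tilde{h}) \cap L(B) = \emptyset$. By Lemma 2.7, $Y(\tilde{g}, \tilde{h}) \subseteq L(B)$
iff a $q$, $r$ and $a$ exist such that $\langle q, a, r \rangle \in \tilde{g}$ and $\langle r, 1, r \rangle \in \tilde{h}$. Since $Z(\hat{g}, \hat{h}) \subseteq Y(\tilde{g}, \tilde{h})$ and
$Z(\hat{g}, \hat{h})$ is non-empty, $Z(\hat{g}, \hat{h}) \subseteq L(B)$ iff such a $q$, $r$ and $a$ exist. □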



In an analogous fashion to Section 2.1, we can use supergraphs to test the containment
of two automata, $A$ and $B$. Search all pairs of supergraphs, $\hat{g}, \hat{h} \in \widehat{Q}_{A,B}$, for a pair that is
both proper and for which there does not exist a $q \in Q^{in}_B$, $r \in Q_B$, $a \in \{0, 1\}$ such that
$\langle q, a, r \rangle \in \hat{g}$ and $\langle r, 1, r \rangle \in \hat{h}$. Such a pair is a counterexample to containment. If no such
pair exists, then $L(A) \subseteq L(B)$.



3.2. Composition of Supergraphs. Employing supergraphs to test containment faces
difficulty on two fronts. First, the number of supergraphs is very large. Second, verifying
properness requires checking language nonemptiness and containment: PSPACE-hard
problems. To address these problems we construct only supergraphs with non-empty languages.
Borrowing the notion of composition from Section 2.3 allows us to use exponential space
to compute exactly the needed supergraphs. Along the way we develop a polynomial-time
test for the containment of supergraph languages. Our plan is to start with graphs
corresponding to single letters and compose them until we reach closure. The resulting subset
of $\widehat{Q}_{A,B}$, written $\widehat{Q}^f_{A,B}$, contains exactly the supergraphs with non-empty languages. In
addition to removing the need to check for emptiness, composition allows us to test the sole






S. FOGARTY AND M. Y. VARDI 




Figure 7: The composition of a graph with itself. 



remaining aspect of properness, language containment, in time polynomial in the size of the 
supergraphs. We begin by defining the composition of simple graphs. 

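Definition 3.5. Given two graphs $\tilde{g}$ and $\tilde{h}$, define their composition, written as $\tilde{g}; \tilde{h}$, as the
graph

$\{ \langle q, 1, r \rangle \mid q, r, s \in Q_B,\ \langle q, b, s \rangle \in \tilde{g},\ \langle s, c, r \rangle \in \tilde{h},\ b = 1 \text{ or } c = 1 \}$
$\cup\ \{ \langle q, 0, r \rangle \mid q, r, s \in Q_B,\ \langle q, 0, s \rangle \in \tilde{g},\ \langle s, 0, r \rangle \in \tilde{h}, \text{ and}$
$\quad \text{for all } t \in Q_B,\ b, c \in \{0, 1\}, \text{ if } \langle q, b, t \rangle \in \tilde{g} \text{ and } \langle t, c, r \rangle \in \tilde{h} \text{ then } b = c = 0 \}$

Example 3.6. Figure 7 shows the composition of a simple graph with itself. Figure 2 is
also illustrative, as the third graph is the composition of the first two.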
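
The composition of Definition 3.5 can be sketched in a few lines of Python. This is only an illustrative sketch under an assumed representation: a graph is a frozenset of `(q, label, r)` triples with `label` in `{0, 1}`, and the names `Arc`, `Graph` and `compose` are ours, not the paper's.

```python
from typing import FrozenSet, Hashable, Tuple

# Assumed representation: an arc is (source state, label in {0, 1}, target state).
Arc = Tuple[Hashable, int, Hashable]
Graph = FrozenSet[Arc]


def compose(g: Graph, h: Graph) -> Graph:
    """Compose two 0-1 arc-labeled graphs in the sense of Definition 3.5.

    An arc (q, 1, r) is produced when some intermediate state s links q to r
    and at least one of the two linked arcs is 1-labeled; otherwise, if q and
    r are linked only through pairs of 0-labeled arcs, (q, 0, r) is produced.
    """
    best = {}  # (q, r) -> strongest label over all intermediate states
    for (q, b, s) in g:
        for (s2, c, r) in h:
            if s == s2:
                label = 1 if (b == 1 or c == 1) else 0
                best[(q, r)] = max(best.get((q, r), 0), label)
    return frozenset((q, a, r) for (q, r), a in best.items())


# Hypothetical two-state graph: p reaches q, and q has an accepting self-loop.
g = frozenset({("p", 0, "q"), ("q", 1, "q")})
gg = compose(g, g)
```

Here `gg` contains the arcs `("p", 1, "q")` and `("q", 1, "q")`, and composing `gg` with itself yields `gg` again, so `gg` is idempotent.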

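We can then define the composition of two supergraphs $\hat{g} = \langle \langle q, r \rangle, \tilde{g} \rangle$ and $\hat{h} = \langle \langle r, s \rangle, \tilde{h} \rangle$,
written $\hat{g}; \hat{h}$, as the supergraph $\langle \langle q, s \rangle, \tilde{g}; \tilde{h} \rangle$. To generate exactly the set of supergraphs with
non-empty languages, we start with supergraphs describing single letters. For a containment
problem $L(A) \subseteq L(B)$, define the subset of $\widehat{Q}_{A,B}$ corresponding to single letters to be
$\widehat{Q}^1_{A,B} = \{ \hat{g} \mid \hat{g} \in \widehat{Q}_{A,B},\ a \in \Sigma,\ a \in L(\hat{g}) \}$. For completeness, we present a constructive
definition of $\widehat{Q}^1_{A,B}$.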
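Definition 3.7.

$\widehat{Q}^1_{A,B} = \{ \langle \langle q, r \rangle, \tilde{g} \rangle \mid q \in Q_A,\ r \in \rho_A(q, a),\ a \in \Sigma,$
$\quad \tilde{g} = \{ \langle q', 0, r' \rangle \mid q' \in (Q_B \setminus F_B),\ r' \in (\rho_B(q', a) \setminus F_B) \}\ \cup$
$\quad\quad\ \{ \langle q', 1, r' \rangle \mid q' \in Q_B,\ r' \in \rho_B(q', a),\ q' \text{ or } r' \in F_B \} \}$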
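
As a rough illustration of Definition 3.7, the single-letter supergraphs can be enumerated directly from the two transition relations. The sketch below assumes transition relations given as dictionaries from `(state, letter)` to sets of successor states; the function name `initial_supergraphs` and this encoding are hypothetical choices made for the example.

```python
def initial_supergraphs(alphabet, states_a, rho_a, states_b, rho_b, final_b):
    """Enumerate the single-letter supergraphs of Definition 3.7.

    A supergraph is modelled as ((q, r), graph), where (q, r) is an arc of A
    and graph is a frozenset of (q', label, r') triples over the states of B.
    """
    result = set()
    for a in alphabet:
        arcs = set()
        for q in states_b:
            for r in rho_b.get((q, a), ()):
                # 1-labeled when either endpoint is accepting, 0-labeled otherwise.
                label = 1 if (q in final_b or r in final_b) else 0
                arcs.add((q, label, r))
        graph = frozenset(arcs)
        # Pair the single graph for letter a with every a-transition of A.
        for q in states_a:
            for r in rho_a.get((q, a), ()):
                result.add(((q, r), graph))
    return result


# Hypothetical automata over the one-letter alphabet {"a"}.
rho_a = {("s", "a"): {"s"}}
rho_b = {("x", "a"): {"x", "y"}, ("y", "a"): {"y"}}
q1 = initial_supergraphs({"a"}, {"s"}, rho_a, {"x", "y"}, rho_b, {"y"})
```

Each letter yields exactly one graph over the states of B, so the size of the result is bounded by the number of transitions of A.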

We then define $\widehat{Q}^f_{A,B}$ to be the closure of $\widehat{Q}^1_{A,B}$ under composition. Algorithm
DoubleGraphSearch, which we prove correct below, employs composition to check the
containment of two automata. It first generates the set of initial supergraphs, and then
computes the closure of this set under composition. Along the way it tests properness by using
composition. Every time it encounters a proper pair of supergraphs, it either verifies that
a satisfying pair of arcs exists, or halts with a counterexample to containment. We call this
search the double-graph search.

To begin proving our algorithm correct, we link composition and the concatenation of
languages, first for simple graphs and then for supergraphs.

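Lemma 3.8. For every two graphs $\tilde{g}$ and $\tilde{h}$, it holds that $L(\tilde{g}) \cdot L(\tilde{h}) \subseteq L(\tilde{g}; \tilde{h})$.

Algorithm 2: DoubleGraphSearch(A, B)
Data: Two Büchi automata, A and B.
Result: Whether L(A) is contained in L(B).
Initialize $\widehat{Q}^f_{A,B} \Leftarrow \widehat{Q}^1_{A,B}$
repeat
    for all pairs $\hat{g}, \hat{h} \in \widehat{Q}^f_{A,B}$ where $\bar{g} = \langle q, r \rangle$ and $\bar{h} = \langle r, s \rangle$ do
        Add $\hat{g}; \hat{h}$ to $\widehat{Q}^f_{A,B}$
        if $q \in Q^{in}_A$, $r \in F_A$, $s = r$, $\hat{g}; \hat{h} = \hat{g}$ and $\hat{h}; \hat{h} = \hat{h}$ then
            if there do not exist $\langle q, a, r \rangle \in \hat{g}$ and $\langle r, 1, r \rangle \in \hat{h}$ where $q \in Q^{in}_B$ then
                return Not Contained
until $\widehat{Q}^f_{A,B}$ reaches closure
return Contained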



Proof: Consider two words $w_1 \in L(\tilde{g})$, $w_2 \in L(\tilde{h})$. By Definition 2.2, to prove $w_1 w_2 \in
L(\tilde{g}; \tilde{h})$ we must show that for every $q, r \in Q$: both (1) $\langle q, a, r \rangle \in \tilde{g}; \tilde{h}$ iff there is a path
from $q$ to $r$ over $w_1 w_2$, and (2) that $a = 1$ iff there is an accepting path.

If an arc $\langle q, a, r \rangle \in \tilde{g}; \tilde{h}$ exists, then there is an $s \in Q$ such that $\langle q, b, s \rangle \in \tilde{g}$ and
$\langle s, c, r \rangle \in \tilde{h}$. By Definition 2.2, this implies the existence of a path $x_1 s$ from $q$ to $s$ over $w_1$,
and a path $s x_2$ from $s$ to $r$ over $w_2$. Thus $x_1 s x_2$ is a path from $q$ to $r$ over $w_1 w_2$.

If $a$ is 1, then either $b$ or $c$ must be 1. By Definition 2.2, $b$ (resp., $c$) is 1 iff there is an
accepting path $x'_1 s$ (resp., $s x'_2$) over $w_1$ (resp., $w_2$) from $q$ to $s$ (resp., $s$ to $r$). In this case
$x'_1 s x_2$ (resp., $x_1 s x'_2$) is an accepting path in $B$ from $q$ to $r$ over $w_1 w_2$.

Symmetrically, if there is a path $x$ from $q$ to $r$ over $w_1 w_2$, then after reading $w_1$ we are
in some state $s$ and have split $x$ into $x_1 s x_2$, so that $x_1 s$ is a path from $q$ to $s$ and $s x_2$ a
path from $s$ to $r$. Thus by Definition 2.2 $\langle q, b, s \rangle \in \tilde{g}$, $\langle s, c, r \rangle \in \tilde{h}$, and $\langle q, a, r \rangle \in \tilde{g}; \tilde{h}$.

Furthermore, if there is an accepting path from $q$ to $r$ over $w_1 w_2$, then after reading $w_1$
we are in some state $s$ and have split the path into $x_1 s x_2$, so that $x_1 s$ is a path from $q$ to
$s$, and $s x_2$ a path from $s$ to $r$. Either $x_1 s$ or $s x_2$ must be accepting, and thus by Definition
2.2 $\langle q, b, s \rangle \in \tilde{g}$, $\langle s, c, r \rangle \in \tilde{h}$, and either $b$ or $c$ must be 1. Therefore $a$ must be 1. □

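Lemma 3.9. Let $\hat{g}, \hat{h}, \hat{k}$ be supergraphs in $\widehat{Q}^f_{A,B}$ such that $\bar{g} = \langle q, r \rangle$, $\bar{h} = \langle r, s \rangle$, and $\bar{k} = \langle q, s \rangle$.
Then $\hat{g}; \hat{h} = \hat{k}$ iff $L(\hat{g}) \cdot L(\hat{h}) \subseteq L(\hat{k})$.

Proof: Assume $\hat{g}; \hat{h} = \hat{k}$ as a premise. This implies $\hat{k} = \langle \langle q, s \rangle, \tilde{g}; \tilde{h} \rangle$. If either $L(\hat{g})$ or
$L(\hat{h})$ is empty, then $L(\hat{g}) \cdot L(\hat{h})$ is empty and this direction holds trivially. Otherwise,
take two words $u \in L(\hat{g})$, $v \in L(\hat{h})$. By construction, $u \in L(\langle q, r \rangle)$ and $v \in L(\langle r, s \rangle)$. The
definition of the languages of arcs therefore implies the existence of a path from $q$ to $r$ over
$u$ and a path from $r$ to $s$ over $v$. Thus $uv \in L(\langle q, s \rangle)$. Similarly, $u \in L(\tilde{g})$, $v \in L(\tilde{h})$,
and Lemma 3.8 implies that $uv \in L(\tilde{k})$. Thus $uv$ is in $L(\langle \langle q, s \rangle, \tilde{k} \rangle)$, and by Lemma 3.2
$L(\hat{g}) \cdot L(\hat{h}) \subseteq L(\hat{k})$.

In the other direction, if $L(\hat{g}) \cdot L(\hat{h}) \subseteq L(\hat{k})$, we show that $\hat{g}; \hat{h} = \hat{k}$. By definition,
$\hat{g}; \hat{h}$ is $\langle \langle q, s \rangle, \tilde{g}; \tilde{h} \rangle$. As $\hat{g}, \hat{h} \in \widehat{Q}^f_{A,B}$, they are the composition of a finite number of graphs
from $\widehat{Q}^1_{A,B}$. The above direction then demonstrates that they are non-empty, and there is
a word $w \in L(\hat{g}) \cdot L(\hat{h})$. This expands to $w \in (L(\bar{g}) \cap L(\tilde{g})) \cdot (L(\bar{h}) \cap L(\tilde{h}))$, which implies
$w \in L(\tilde{g}) \cdot L(\tilde{h})$. By Lemma 3.8 $w$ is then in $L(\tilde{g}; \tilde{h})$. By the premise, $w$ is also in $L(\hat{k}) \subseteq L(\tilde{k})$.
Since, by Lemma 2.4, $w$ is in the language of exactly one graph, we have that $\tilde{g}; \tilde{h} = \tilde{k}$, which proves $\hat{g}; \hat{h} = \hat{k}$. □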



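Lemma 3.9 provides the polynomial time test for properness employed in Algorithm
DoubleGraphSearch. Namely, given two supergraphs $\hat{g} = \langle \langle q, r \rangle, \tilde{g} \rangle$ and $\hat{h} = \langle \langle r, r \rangle, \tilde{h} \rangle$ from
$\widehat{Q}^f_{A,B}$, the pair $\langle \hat{g}, \hat{h} \rangle$ is proper exactly when $q \in Q^{in}_A$, $r \in F_A$, $\hat{g}; \hat{h} = \hat{g}$ and $\hat{h}; \hat{h} = \hat{h}$. We now
provide the final piece of our puzzle: proving that the closure of $\widehat{Q}^1_{A,B}$ under composition
contains every non-empty supergraph.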



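Lemma 3.10. For two Büchi automata $A$ and $B$, every $\hat{h} \in \widehat{Q}_{A,B}$, where $L(\hat{h}) \neq \emptyset$, is in
$\widehat{Q}^f_{A,B}$.

Proof: Let $\hat{h} = \langle \langle q, r \rangle, \tilde{h} \rangle$ where $L(\hat{h}) \neq \emptyset$. Then there is at least one word $w = \sigma_0 \ldots \sigma_{n-1} \in
L(\hat{h})$, which is to say $w \in L(\langle q, r \rangle) \cap L(\tilde{h})$. By the definition of the languages of arcs, there
is a path $p = p_0 \ldots p_n$ in $A$ over $w$ such that $p_0 = q$ and $p_n = r$.

Define $\tilde{g}_{\sigma_i}$ to be the graph in $\widetilde{Q}_B$ whose language contains $\sigma_i$. Let $\hat{g}_{\sigma_i}$ be $\langle \langle p_i, p_{i+1} \rangle, \tilde{g}_{\sigma_i} \rangle$, and let $\hat{g}_w$
be $\hat{g}_{\sigma_0}; \hat{g}_{\sigma_1}; \ldots; \hat{g}_{\sigma_{n-1}}$. Note that each $\hat{g}_{\sigma_i} \in \widehat{Q}^1_{A,B}$. By Lemma 3.9, $w \in L(\hat{g}_w)$. By Lemma 2.4
$w$ is in the language of only one graph, and so $\tilde{g}_w = \tilde{h}$. By construction, $\bar{g}_w = \langle q, r \rangle$. Therefore $\langle \bar{g}_w, \tilde{g}_w \rangle =
\langle \langle q, r \rangle, \tilde{h} \rangle = \hat{h}$, and $\hat{h}$ is in the closure of $\widehat{Q}^1_{A,B}$ under composition. □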

We can now show the correctness of Algorithm DoubleGraphSearch, using Lemma
3.9 to justify testing properness with composition, and Lemma 3.11 below to justify the
correctness and completeness of our search for a counterexample.

Lemma 3.11. Let $A$ and $B$ be two Büchi automata. $L(A)$ is not contained in $L(B)$ iff
$\widehat{Q}^f_{A,B}$ contains a pair of supergraphs $\hat{g}, \hat{h}$ such that $\langle \hat{g}, \hat{h} \rangle$ is proper and there do not exist
arcs $\langle q, a, r \rangle \in \hat{g}$ and $\langle r, 1, r \rangle \in \hat{h}$, $q \in Q^{in}_B$.

Proof: As all proper graphs are non-empty, this follows from parts (2) and (3) of Lemma
3.4 and Lemma 3.10. □

Theorem 3.1. For every two Büchi automata $A$ and $B$, it holds that $L(A) \subseteq L(B)$ iff
DoubleGraphSearch returns Contained.

Proof: By Lemma 3.9, testing for composition is equivalent to testing for language
containment, and the outer conditional in Algorithm DoubleGraphSearch holds only for proper
pairs of supergraphs. By Lemma 3.11 the inner conditional checks if a proper pair of
supergraphs is a counterexample, and if no such proper pair in $\widehat{Q}^f_{A,B}$ is a counterexample
then containment must hold. □
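
The double-graph search can be sketched end to end in Python. This is an illustrative sketch only, not the authors' implementation: it assumes automata encoded as dictionaries with the hypothetical keys `alphabet`, `states`, `init`, `final` and `delta` (a map from `(state, letter)` to a set of successors), and it makes no attempt at efficiency.

```python
from itertools import product


def compose(g, h):
    """Compose two 0-1 arc-labeled graphs (frozensets of (q, label, r) triples)."""
    best = {}
    for (q, b, s) in g:
        for (s2, c, r) in h:
            if s == s2:
                best[(q, r)] = max(best.get((q, r), 0), b | c)
    return frozenset((q, a, r) for (q, r), a in best.items())


def initial_supergraphs(aut_a, aut_b):
    """Single-letter supergraphs: one B-graph per letter, paired with each A-transition."""
    result = set()
    for a in aut_a["alphabet"]:
        graph = frozenset(
            (q, 1 if (q in aut_b["final"] or r in aut_b["final"]) else 0, r)
            for q in aut_b["states"]
            for r in aut_b["delta"].get((q, a), ()))
        for q in aut_a["states"]:
            for r in aut_a["delta"].get((q, a), ()):
                result.add(((q, r), graph))
    return result


def has_witness(g, h, init_b):
    """Is there an arc (q, a, r) in g with q initial in B and (r, 1, r) in h?"""
    return any(q in init_b and (r, 1, r) in h for (q, _a, r) in g)


def double_graph_search(aut_a, aut_b):
    """Return True iff L(A) is contained in L(B), in the style of Algorithm 2."""
    closure = initial_supergraphs(aut_a, aut_b)
    while True:
        new = set()
        for (arc_g, g), (arc_h, h) in product(closure, repeat=2):
            (q, r), (r2, s) = arc_g, arc_h
            if r != r2:
                continue  # the arcs do not line up, so the pair cannot be composed
            gh = compose(g, h)
            new.add(((q, s), gh))
            # Properness test via composition.
            proper = (q in aut_a["init"] and r in aut_a["final"] and s == r
                      and gh == g and compose(h, h) == h)
            if proper and not has_witness(g, h, aut_b["init"]):
                return False  # proper pair without satisfying arcs: counterexample
        if new <= closure:
            return True  # closure reached without finding a counterexample
        closure |= new


# Hypothetical example: A accepts every word over {a, b};
# B accepts the words containing infinitely many a's.
A_ALL = {"alphabet": {"a", "b"}, "states": {"s"}, "init": {"s"}, "final": {"s"},
         "delta": {("s", "a"): {"s"}, ("s", "b"): {"s"}}}
B_INF_A = {"alphabet": {"a", "b"}, "states": {"x", "y"}, "init": {"x"}, "final": {"x"},
           "delta": {("x", "a"): {"x"}, ("x", "b"): {"y"},
                     ("y", "a"): {"x"}, ("y", "b"): {"y"}}}
```

With these two automata, `double_graph_search(A_ALL, B_INF_A)` reports non-containment (the word consisting only of b's is a counterexample), while `double_graph_search(B_INF_A, B_INF_A)` reports containment.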



3.3. Strongly Suffix Closed Languages. Algorithm DoubleGraphSearch has much the
same structure as Algorithm LJB. The most noticeable difference is that Algorithm
DoubleGraphSearch checks pairs of supergraphs, where Algorithm LJB checks only
single size-change graphs. Indeed, Theorem 2.1 suggests that, for some languages, a cycle
implies the existence of a lasso. When proving containment of Büchi automata with such
languages, it is sufficient to search for a graph $\tilde{h} \in \widetilde{Q}_B$, where $\tilde{h}; \tilde{h} = \tilde{h}$, with no arc $\langle r, 1, r \rangle$.
This single-graph search reduces the complexity of our algorithm significantly. What enables
this in size-change termination is the late-start property: threads can begin at arbitrary
points. We here define the class of automata amenable to this optimization, first presenting
the case for universality testing, without proof, for clarity.

In size-change termination, the late-start property asserts that an accepting cycle can
start at an arbitrary point. Intuitively, this suggests that an arc $\langle r, 1, r \rangle \in \tilde{h}$ might not
need a matching prefix $\langle q, a, r \rangle$ in some $\tilde{g}$: the cycle can just start at $r$. In the context of
universality, we can apply this method when it is safe to add or remove arbitrary prefixes
of a word. To describe these languages we extend the standard notion of suffix closure. A
language $L$ is suffix closed when, for every $w \in L$, every suffix of $w$ is in $L$.

Definition 3.12. A language $L$ is strongly suffix closed if it is suffix closed and for every
$w \in L$, $w_1 \in \Sigma^+$, we have that $w_1 w \in L$.

Lemma 3.13. Let $B$ be a Büchi automaton where every state in $Q$ is reachable and $L(B)$ is
strongly suffix closed. $B$ is not universal iff the set of supergraphs with non-empty languages
contains a graph $\tilde{h}$ such that $\tilde{h}; \tilde{h} = \tilde{h}$ and $\tilde{h}$ does not contain an arc of the form $\langle r, 1, r \rangle$.



As an intuition for the correctness of Lemma 3.13, note that the existence of a 1-labeled
cyclic arc in $\tilde{h}$ implies a loop, that $Q$ being reachable implies a prefix can be prepended to
this loop to make a lasso, and that strong suffix closure allows us to swap this prefix for the
prefix of every other word that shares this cycle.

To extend this notion to handle containment questions $L_1 \subseteq L_2$, we restrict our focus
to words in $L_1$. Instead of requiring $L_2$ to be closed under arbitrary prefixes, $L_2$ need only
be closed under prefixes that keep the word in $L_1$.

Definition 3.14. A language $L_2$ is strongly suffix closed with respect to $L_1$ when $L_2$ is suffix
closed and, for every $w \in L_1 \cap L_2$, $w_1 \in \Sigma^+$, if $w_1 w \in L_1$ then $w_1 w \in L_2$.

When checking the containment of $A$ in $B$ for the case when $L(B)$ is strongly suffix closed
with respect to $L(A)$, we can employ the simplified algorithm below. As in Algorithm
DoubleGraphSearch, we search all supergraphs in $\widehat{Q}^f_{A,B}$. Rather than searching for a proper
pair of supergraphs, however, Algorithm SingleGraphSearch searches for a single
supergraph $\hat{h}$ where $\hat{h}; \hat{h} = \hat{h}$ that does not contain an arc of the form $\langle r, 1, r \rangle$. We call this search
the single-graph search.

We now prove Algorithm SingleGraphSearch correct. Theorem 3.2 demonstrates that, 
under the requirements specified, the presence of a single-graph counterexample refutes 
containment, and the absence of such a supergraph proves containment. 

Theorem 3.2. Let $A$ and $B$ be two Büchi automata where $Q^{in}_A = Q_A$, every state in $Q_B$
is reachable, and $L(B)$ is strongly suffix closed with respect to $L(A)$. Then $L(A)$ is not
contained in $L(B)$ iff $\widehat{Q}^f_{A,B}$ contains a supergraph $\hat{h} = \langle \langle s, s \rangle, \tilde{h} \rangle$ such that $s \in F_A$, $\hat{h}; \hat{h} = \hat{h}$,
and $\hat{h}$ does not contain an arc $\langle r, 1, r \rangle$.

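Proof: In one direction, assume $\widehat{Q}^f_{A,B}$ contains a supergraph $\hat{h} = \langle \langle s, s \rangle, \tilde{h} \rangle$ where $s \in F_A$,
$\hat{h}; \hat{h} = \hat{h}$, and there is no arc $\langle r, 1, r \rangle \in \hat{h}$. We show that $Z(\hat{h}, \hat{h})$ is a proper language not
contained in $L(B)$. As $\hat{h} \in \widehat{Q}^f_{A,B}$, we know $L(\hat{h})$ is not empty, implying $Z(\hat{h}, \hat{h})$ is
non-empty. As $Q^{in}_A = Q_A$, it holds that $s \in Q^{in}_A$. By Lemma 3.9 the premise $\hat{h}; \hat{h} = \hat{h}$ implies
$L(\hat{h}) \cdot L(\hat{h}) \subseteq L(\hat{h})$, and $Z(\hat{h}, \hat{h})$ is proper. Finally, as there is no $\langle r, 1, r \rangle \in \hat{h}$, by Theorem
3.1, $Z(\hat{h}, \hat{h}) \not\subseteq L(B)$, and $Z(\hat{h}, \hat{h})$ is a counterexample to $L(A) \subseteq L(B)$.

Algorithm 3: SingleGraphSearch(A, B)
Data: Two Büchi automata, A and B.
Require: $Q^{in}_A = Q_A$, every state in $Q_B$ is reachable, and $L(B)$ is strongly suffix closed w.r.t. $L(A)$
Result: Whether L(A) is contained in L(B).
Initialize $\widehat{Q}^f_{A,B} \Leftarrow \widehat{Q}^1_{A,B}$
repeat
    for all pairs $\hat{g}, \hat{h} \in \widehat{Q}^f_{A,B}$ where $\bar{g} = \langle q, r \rangle$ and $\bar{h} = \langle r, s \rangle$ do
        $\hat{k} \Leftarrow \hat{g}; \hat{h}$
        Add $\hat{k}$ to $\widehat{Q}^f_{A,B}$
        if $q \in F_A$, $q = s$, and $\hat{k}; \hat{k} = \hat{k}$ then
            if there does not exist an arc $\langle r, 1, r \rangle \in \hat{k}$ then
                return Not Contained
until $\widehat{Q}^f_{A,B}$ reaches closure
return Contained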



In the opposite direction, assume the premise that $\widehat{Q}^f_{A,B}$ does not contain a supergraph
$\hat{h} = \langle \langle s, s \rangle, \tilde{h} \rangle$ where $s \in F_A$, $\hat{h}; \hat{h} = \hat{h}$, and there is no arc $\langle r, 1, r \rangle \in \hat{h}$. We prove that every
word $w \in L(A)$ is also in $L(B)$. Take a word $w \in L(A)$. By Lemma 3.3, $w$ is in some proper
language $Z(\hat{g}, \hat{h})$ and can be broken into $w_1 w_2$ where $w_1 \in L(\hat{g})$, $w_2 \in L(\hat{h})^\omega$.

Because $Z(\hat{g}, \hat{h})$ is proper, Lemma 3.9 implies $\hat{h} = \langle \langle s, s \rangle, \tilde{h} \rangle$ where $s \in F_A$ and $\hat{h}; \hat{h} = \hat{h}$.
This, along with our premise, implies $\hat{h}$ contains an arc $\langle r, 1, r \rangle$. Since all states in $Q_B$ are
reachable, there is $q \in Q^{in}_B$ and $u \in \Sigma^+$ with a path in $B$ from $q$ to $r$ over $u$. By Lemma 2.7
this implies $u w_2$ is accepted by $B$. For $L(B)$ to be strongly suffix closed with respect to $L(A)$,
it must be suffix closed. Therefore $w_2 \in L(B)$. Now we move to $L(A)$, and note that the
premise $Q^{in}_A = Q_A$ implies $L(A)$ is suffix closed. Thus the fact that $w_1 w_2 \in L(A)$ implies
$w_2 \in L(A)$. Since $L(B)$ is strongly suffix closed with respect to $L(A)$, and $w_2 \in L(B)$, it
must be that $w_1 w_2 \in L(B)$. □
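
For comparison with the double-graph search, the single-graph search can be sketched as follows. The sketch uses the same assumed dictionary encoding of automata (keys `alphabet`, `states`, `final`, `delta`, all hypothetical names) and does not check the preconditions of Theorem 3.2; the caller is assumed to guarantee them.

```python
from itertools import product


def compose(g, h):
    """Compose two 0-1 arc-labeled graphs (frozensets of (q, label, r) triples)."""
    best = {}
    for (q, b, s) in g:
        for (s2, c, r) in h:
            if s == s2:
                best[(q, r)] = max(best.get((q, r), 0), b | c)
    return frozenset((q, a, r) for (q, r), a in best.items())


def initial_supergraphs(aut_a, aut_b):
    """Single-letter supergraphs: one B-graph per letter, paired with each A-transition."""
    result = set()
    for a in aut_a["alphabet"]:
        graph = frozenset(
            (q, 1 if (q in aut_b["final"] or r in aut_b["final"]) else 0, r)
            for q in aut_b["states"]
            for r in aut_b["delta"].get((q, a), ()))
        for q in aut_a["states"]:
            for r in aut_a["delta"].get((q, a), ()):
                result.add(((q, r), graph))
    return result


def single_graph_search(aut_a, aut_b):
    """Return True iff L(A) is contained in L(B), in the style of Algorithm 3.

    Assumes (without checking) that every state of A is initial, every state
    of B is reachable, and L(B) is strongly suffix closed w.r.t. L(A).
    """
    closure = initial_supergraphs(aut_a, aut_b)
    while True:
        new = set()
        for (arc_g, g), (arc_h, h) in product(closure, repeat=2):
            (q, r), (r2, s) = arc_g, arc_h
            if r != r2:
                continue
            k = compose(g, h)
            new.add(((q, s), k))
            # An idempotent supergraph on an accepting cycle of A ...
            if q in aut_a["final"] and q == s and compose(k, k) == k:
                # ... with no arc (r, 1, r) is a single-graph counterexample.
                if not any(a == 1 and x == y for (x, a, y) in k):
                    return False
        if new <= closure:
            return True
        closure |= new


# Hypothetical one-letter example automata.
A = {"alphabet": {"c"}, "states": {"f"}, "final": {"f"}, "delta": {("f", "c"): {"f"}}}
B_OK = {"alphabet": {"c"}, "states": {"x"}, "final": {"x"}, "delta": {("x", "c"): {"x"}}}
B_BAD = {"alphabet": {"c"}, "states": {"x"}, "final": set(), "delta": {("x", "c"): {"x"}}}
```

The only structural difference from the double-graph sketch is the test inside the loop: a single idempotent composition is inspected instead of a proper pair.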



3.4. From Ramsey-Based Containment to Size-Change Termination. We now delve
into the connection between the LJB algorithm for size-change termination and the
single-graph search algorithm for Büchi containment. We will show that Algorithm LJB is a
specialized realization of Algorithm SingleGraphSearch. Given an SCT problem $L$, size-change
graphs in LJB($L$) are direct analogues of supergraphs in SingleGraphSearch($A_{Flow(L)}$,
$A_{Desc(L)}$). For convenience, take $L = \langle H, P, C, \mathcal{G} \rangle$, $A_{Flow(L)} = \langle C, Q_{Fl}, Q^{in}_{Fl}, \rho_{Fl}, F_{Fl} \rangle$, and
$A_{Desc(L)} = \langle C, Q_{Ds}, Q^{in}_{Ds}, \rho_{Ds}, F_{Ds} \rangle$.

We first show that $A_{Flow(L)}$ and $A_{Desc(L)}$ satisfy the preconditions of Algorithm
SingleGraphSearch: that $Q^{in}_{Fl} = Q_{Fl}$, that every state in $Q_{Ds}$ is reachable, and that
$Desc(L)$ is strongly suffix closed with respect to $Flow(L)$. For the first and second
requirement, it suffices to observe that every state in both $A_{Flow(L)}$ and $A_{Desc(L)}$ is initial.¹

For the third, strong suffix closure is a direct consequence of the definition of a thread:
since a thread can start at arbitrary points, it does not matter what call path we use to


¹In the original reduction, 1-labeled parameters may not have been reachable.

[Figure 8 graphic not recoverable. Panels: call sites a : f → g and b : g → g, with supergraph arcs ⟨f,g⟩ and ⟨g,g⟩.]

Figure 8: Size-Change Graphs vs. Supergraphs: The SCGs for call sites a and b, from Figure 3,
and corresponding supergraphs for the characters a and b, from $\widehat{Q}^1_{A_{Flow(L)}, A_{Desc(L)}}$.



reach that point. Adding a prefix to a call path cannot cause that call path to become
non-terminating. Thus the late-start property is precisely $Desc(L)$ being strongly suffix
closed with respect to $Flow(L)$, and we can employ the single-graph search.

Consider supergraphs in $\widehat{Q}_{A_{Flow(L)}, A_{Desc(L)}}$, from here simply denoted by $\widehat{Q}_L$. The state
space of $A_{Flow(L)}$ is the set of functions $H$, and the state space of $A_{Desc(L)}$ is the union of $H$
and $Q_p$, the set of all $\{0, 1\}$-labeled parameters. A supergraph in $\widehat{Q}_L$ thus comprises an arc
$\langle q, r \rangle$ in $H$ and a $\{0, 1\}$-labeled graph $\tilde{g}$ over $H \cup Q_p$. The arc asserts the existence of a call
path from $q$ to $r$, and the graph $\tilde{g}$ captures the relevant information about corresponding
paths in $A_{Desc(L)}$.

These supergraphs are almost the same as SCGs, $G : q \to r$ (see Figure 8). Aside from
notational differences, both contain an arc asserting the existence of a call path between 
two functions, and a {0, l}-labeled graph. There are nodes in both graphs that correspond 
to parameters of functions, and arcs between two such nodes describe a thread between the 
corresponding parameters. The analogy falls short, however, on three points: 

(1) In SCGs, nodes are always parameters of functions. In supergraphs, nodes can be either 
parameters of functions or function names. 

(2) In SCGs, nodes are unlabeled. In supergraphs, nodes are labeled either 0 or 1.

(3) In an SCG, only the nodes corresponding to the parameters of two specific functions 
are present. In a supergraph, nodes corresponding to every parameter of every function 
exist. 

Each difference is an opportunity to specialize the Ramsey-based containment algorithm,
Algorithm SingleGraphSearch, by simplifying supergraphs. When these specializations
are taken together, we have Algorithm LJB.

(1) No functions in $H$ are accepting for $A_{Desc(L)}$, and once we transition out of $H$ into $Q_p$
we can never return to $H$. Therefore nodes $r$ corresponding to function names can never
be part of a descending arc $\langle r, 1, r \rangle$. Since we only search for arcs of the form $\langle r, 1, r \rangle$,
we can simplify supergraphs in $\widehat{Q}_L$ by removing all nodes corresponding to functions.

(2) The labels on parameters are the result of encoding a Büchi edge acceptance condition
in a Büchi state acceptance condition automaton, and can be dropped from supergraphs
with no loss of information. Consider an arc $\langle (f, a), b, (g, c) \rangle$. If $b$ is 1, we know the
corresponding thread contains a descending arc. The value of $c$ tells us if the final arc
in the thread is descending, but which arc is descending is irrelevant. Thus it is safe to
simplify supergraphs in $\widehat{Q}_L$ by removing labels on parameters.

(3) While all parameters have corresponding states in $A_{Desc(L)}$, each supergraph describes
threads in a call sequence between two particular functions. There are no threads in
this call sequence between parameters of other functions, and so no supergraph with a
non-empty language has arcs between the parameters of other functions. We can thus
simplify supergraphs in $\widehat{Q}_L$ by removing all nodes corresponding to parameters of other
functions.

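To formalize this notion of simplification, we first define $G_L$, the set of simplified
supergraphs, and show that $G^f_L$ (defined below) is in one-to-one correspondence with $S$, the closure of $\mathcal{G}$
under composition.

Definition 3.15. $G_L = \{ \langle \langle f_1, f_2 \rangle, k \rangle \mid f_1, f_2 \in H,\ k \subseteq P(f_1) \times \{0, 1\} \times P(f_2) \}$

Say that $\langle \bar{r}, \tilde{g} \rangle \in \widehat{Q}_L$ simplifies to $\langle \bar{r}, k \rangle \in G_L$ when $\langle q, b, r \rangle \in k$ iff there exist $a, c \in
\{0, 1\}$ such that $\langle (q, a), b, (r, c) \rangle \in \tilde{g}$. Let $G^1_L$ be $\{ \hat{k} \mid \hat{g} \in \widehat{Q}^1_L,\ \hat{g} \text{ simplifies to } \hat{k} \}$, and $G^f_L$ be
the closure of $G^1_L$ under composition.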
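
The simplification relation can be illustrated with a short Python sketch. The encoding is an assumption made for the example: labeled parameters are tuples `(name, label)`, function names are plain strings, and a supergraph is a pair of an arc and a frozenset of `(source, label, target)` triples. The function name `simplify` is hypothetical.

```python
def simplify(supergraph):
    """Simplify a supergraph in the sense described after Definition 3.15.

    Arcs touching a function-name node are dropped, and the labels carried by
    parameter nodes are erased; arc labels are kept unchanged.
    """
    arc, graph = supergraph
    simplified = set()
    for (src, b, dst) in graph:
        # Only arcs between two labeled parameters survive simplification.
        if isinstance(src, tuple) and isinstance(dst, tuple):
            simplified.add((src[0], b, dst[0]))
    return (arc, frozenset(simplified))


# Hypothetical supergraph for a call from f (parameters x, y) to g (parameters u, v).
example = (("f", "g"), frozenset({
    ("f", 0, "g"),                 # function-to-function arc, dropped
    ("f", 0, ("u", 0)),            # function-to-parameter arc, dropped
    (("x", 0), 1, ("u", 1)),       # descending thread from x to u
    (("y", 0), 0, ("v", 0)),       # non-descending thread from y to v
}))
simplified_example = simplify(example)
```

After simplification the example retains only the arcs `("x", 1, "u")` and `("y", 0, "v")`, which has the shape of a size-change graph for the call from f to g.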

We can map SCGs directly to elements of $G_L$. Say $G : f_1 \to f_2 \equiv \langle \langle f_1, f_2 \rangle, k \rangle$ when
$q \xrightarrow{a} r \in G$ iff $\langle q, a, r \rangle \in k$. Note that the composition operation for supergraphs of this
form is identical to the composition of SCGs: if $G_1 \equiv \hat{g}$ and $G_2 \equiv \hat{h}$, then $G_1; G_2 \equiv \hat{g}; \hat{h}$.
Therefore every element of $\widehat{Q}^f_L$ simplifies to some element of $G^f_L$.

We now show that supergraphs whose languages contain single characters are in
one-to-one correspondence with $\mathcal{G}$, and that every idempotent element of $G^f_L$ contains an arc
of the form $\langle r, 1, r \rangle$ exactly when the closure of $\mathcal{G}$ under composition does not contain a
counterexample graph.

Lemma 3.16. Let $L = \langle H, P, C, \mathcal{G} \rangle$ be an SCT problem.

(1) The $\equiv$ relation is a one-to-one correspondence between $G^1_L$ and $\mathcal{G}$.

(2) $L$ is not size-change terminating iff $G^f_L$ contains a supergraph $\hat{k}$ such that $\hat{k}; \hat{k} = \hat{k}$ and
there does not exist an arc of the form $\langle r, 1, r \rangle$ in $\hat{k}$.

Proof:

(1): Given a size-change graph $G \in \mathcal{G}$, we construct a unique supergraph $\hat{k} \in G^1_L$ such
that $\hat{k} \equiv G$. Every size-change graph $G : f_1 \to f_2 \in \mathcal{G}$ is the SCG for a call site $c$ from
$f_1$ to $f_2$. This is a call sequence of length one. Thus there is a $\hat{g} \in \widehat{Q}^1_L$ so that $c \in L(\hat{g})$
and $\bar{g} = \langle f_1, f_2 \rangle$. We show that the simplification of $\hat{g}$ is equivalent to $G$. By the reduction
of Definition 2.12 and the definition of graphs in Definition 2.2, the arc $\langle (q, b), a, (r, c) \rangle \in \tilde{g}$,
for some $b, c \in \{0, 1\}$, exactly when $q \xrightarrow{a} r \in G$. The supergraph $\hat{g}$ simplifies to some
$\hat{k} \in G_L$. By the definition of simplification, $\langle (q, b), a, (r, c) \rangle \in \tilde{g}$ exactly when $\langle q, a, r \rangle \in \hat{k}$.
Thus $\hat{k} \equiv G : f_1 \to f_2$.

In the other direction, $\hat{k} \in G^1_L$ iff there exists a $\hat{g} = \langle \langle f_1, f_2 \rangle, \tilde{g} \rangle \in \widehat{Q}^1_L$ that simplifies to $\hat{k}$.
By Definition 3.7, which defines $\widehat{Q}^1_L$, and Definition 2.12, $\hat{g}$ exists because there is a call site
$c$. This call site corresponds to an SCG $G : f_1 \to f_2$. Analogously to the above, by Definition
2.12 the arcs in $\tilde{g}$ between parameters correspond to arcs in $G$: $\langle (q, b), a, (r, c) \rangle \in \tilde{g}$, for
some $b, c \in \{0, 1\}$, exactly when $q \xrightarrow{a} r \in G$. These are the only arcs that remain after
simplification, during which the labels are removed. Thus $\hat{k} \equiv G : f_1 \to f_2$.

(2): By (1), $\mathcal{G}$ is in one-to-one correspondence with $G^1_L$ under the $\equiv$ relation. Since
composition of supergraphs and SCGs is identical, $S$, the closure of $\mathcal{G}$ under composition,
is in one-to-one correspondence with $G^f_L$. Claim (2) then follows from Theorem 2.1 and
claim (1). □



In conclusion, we can specialize the Ramsey-based containment algorithm for $L(A_{Flow(L)})
\subseteq L(A_{Desc(L)})$ in two ways. First, by Theorem 3.2 we know that $Flow(L) \not\subseteq Desc(L)$ if and
only if $\widehat{Q}^f_L$ contains an idempotent graph $\hat{g} = \hat{g}; \hat{g}$ with no arc of the form $\langle r, 1, r \rangle$. Thus we
can employ the single-graph search instead of the double-graph search. Secondly, we can
simplify supergraphs in $\widehat{Q}^f_L$ by removing the labels on nodes and keeping only nodes
associated with appropriate parameters for the source and target function. The simplifications
of supergraphs whose languages contain single characters are in one-to-one correspondence
with $\mathcal{G}$, the initial set of SCGs. As every state in $Flow(L)$ is accepting, every idempotent
supergraph can serve as a counterexample. Therefore $Flow(L) \subseteq Desc(L)$ if and only if
the closure of the set of simplified supergraphs, which is in one-to-one correspondence with
$\mathcal{G}$, under composition does not contain an idempotent supergraph with no arc of the form
$\langle r, 1, r \rangle$. This is precisely Algorithm LJB.



4. Empirical Analysis 



All the Ramsey-based algorithms presented in Section 2.3 have worst-case running times
that are exponentially larger than those of the rank-based algorithms. We now compare
existing, Ramsey-based, SCT tools to a rank-based Büchi containment solver on the
domain of SCT problems. To facilitate a fair comparison, we briefly describe two
improvements to the algorithms presented above.



4.1. Towards an Empirical Comparison. First, in constructing the analogy between
SCGs in the LJB algorithm and supergraphs in the Ramsey-based containment algorithm,
we noticed that supergraphs contain nodes for every parameter, while SCGs contain only
nodes corresponding to parameters of relevant functions. These nodes are states in $A_{Desc(L)}$.
While we can specialize the Ramsey-based test to avoid them, Büchi containment solvers
might suffer. These states duplicate information. As we already know which functions
each supergraph corresponds to, there is no need for each node to be unique to a specific
parameter.

These extra states emerge because $Desc(L)$ only accepts strings that are contained in
$Flow(L)$, and in doing so demands that parameters only be reached by appropriate call
paths. But the behavior of $A_{Desc(L)}$ on strings not in $Flow(L)$ is irrelevant to the question
of $Flow(L) \subseteq Desc(L)$, and we can replace the names of parameters in $A_{Desc(L)}$ with their
location in the parameter list. Further, we can rely on $Flow(L)$ to verify the sequence of
function calls before our accepting thread and make do with a single waiting state. As an
example, see Figure 9.

Figure 9: $A_{Desc(L)}$ (left), from the original reduction of Definition 2.12, and $A'_{Desc(L)}$ (right),
from the reduction of Definition 4.1.



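Definition 4.1. Given an SCT problem $L = \langle H, P, C, \mathcal{G} \rangle$ and a projection $Ar$ of all
parameters onto their positions $1..n$ in the parameter list, define:

$A'_{Desc(L)} = \langle C,\ S \cup \{q_0\},\ S \cup \{q_0\},\ \rho_D,\ F \rangle$

where
$S = \{1..n\} \times \{1, 0\}$
$\rho_D(q_0, c) = \{q_0\} \cup \{ \langle Ar(x), 0 \rangle \mid c : f_1 \to f_2,\ x \in P(f_2) \}$
$\rho_D(\langle h, a \rangle, c) = \{ \langle Ar(y), a' \rangle \mid x \xrightarrow{a'} y \in G_c,\ h = Ar(x) \}$
$F = \{1..n\} \times \{1\}$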
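
The construction of Definition 4.1 can be sketched as a small builder function. This is a hedged sketch under assumed encodings that are not taken from the paper: `params` maps each function to its ordered parameter list (parameter names are assumed globally unique), `calls` maps each call site `c` to a triple `(f1, f2, scg)`, and `scg` is a set of `(x, label, y)` arcs with label 1 for a descending arc and 0 otherwise. The name `build_desc_prime` is hypothetical.

```python
def build_desc_prime(params, calls):
    """Build the automaton A'_Desc(L) of Definition 4.1 as a plain dictionary."""
    n = max((len(ps) for ps in params.values()), default=0)
    # Ar: project every parameter onto its 1-based position in its parameter list.
    position = {x: i + 1 for ps in params.values() for i, x in enumerate(ps)}
    q0 = "q0"  # the single waiting state
    labeled = {(i, a) for i in range(1, n + 1) for a in (0, 1)}
    delta = {}
    for c, (f1, f2, scg) in calls.items():
        # From q0 we may keep waiting, or start a thread at any parameter of f2.
        delta[(q0, c)] = {q0} | {(position[x], 0) for x in params[f2]}
        # From a labeled position, follow the arcs of the SCG for call site c.
        for (x, a_new, y) in scg:
            for a in (0, 1):
                delta.setdefault(((position[x], a), c), set()).add((position[y], a_new))
    return {
        "alphabet": set(calls),
        "states": labeled | {q0},
        "init": labeled | {q0},          # every state is initial
        "final": {(i, 1) for i in range(1, n + 1)},
        "delta": delta,
    }


# Hypothetical SCT problem: f(x, y) calls g(u) at call site c, with x strictly above u.
aut = build_desc_prime({"f": ["x", "y"], "g": ["u"]},
                       {"c": ("f", "g", {("x", 1, "u")})})
```

The state space depends only on the longest parameter list, not on the number of functions, which is the point of replacing parameter names by positions.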



Lemma 4.2. $L(A_{Flow(L)}) \subseteq L(A_{Desc(L)})$ iff $L(A_{Flow(L)}) \subseteq L(A'_{Desc(L)})$.

Proof: The languages of $A_{Desc(L)}$ and $A'_{Desc(L)}$ are not the same. What we demonstrate
is that for every word in $Flow(L)$, we can convert an accepting run in one of $A_{Desc(L)}$ or
$A'_{Desc(L)}$ into an accepting run in the other. Recall that the states of $A_{Flow(L)}$ are functions
$f \in H$. States of $A_{Desc(L)}$ are either elements of $H$ or elements of $Q_p = \bigcup_{f \in H} P(f) \times \{1, 0\}$,
the set of labeled parameters. For convenience, given a pair $\langle x, a \rangle \in Q_p$, define $Ar(\langle x, a \rangle)$
to be $\langle Ar(x), a \rangle$.

Consider an accepting run $r = r_0 r_1 \ldots$ of $A_{Desc(L)}$ over a word $w$. Let $s = s_0 s_1 \ldots$ be
the sequence of states in $A'_{Desc(L)}$ such that when $r_i \in H$, $s_i = q_0$, and when $r_i \in Q_p$,
$s_i = Ar(r_i)$. By the definition of $A'_{Desc(L)}$, $q_0$ always transitions to $q_0$ and a transition
between $r_i$ and $r_{i+1}$ implies a transition between $Ar(r_i)$ and $Ar(r_{i+1})$. Therefore $s$ is a
run of $A'_{Desc(L)}$ over $w$. Furthermore, if $\langle x, a \rangle$ is an accepting state in $A_{Desc(L)}$, $a = 1$ and
$\langle Ar(x), a \rangle$ is an accepting state of $A'_{Desc(L)}$. Thus, $s$ is an accepting run of $A'_{Desc(L)}$ over $w$.

Conversely, consider a word $w$ with an accepting run $r = r_0 r_1 \ldots$ of $A'_{Desc(L)}$ and an
accepting run $s = s_0 s_1 \ldots$ of $A_{Flow(L)}$. We define an accepting run $t = t_0 t_1 \ldots$ of $A_{Desc(L)}$ on
$w$. Each $t_i$ depends on the corresponding $r_i$ and $s_i$. If $r_i = q_0$ and $s_i = f$, then $t_i = f$. If
$r_i = \langle k, a \rangle$ and $s_i = f$, then $t_i = \langle x, a \rangle$ where $x$ is the $k$th parameter in $f$'s parameter list.

For a call $c : f_1 \to f_2$, take two labeled parameters, $q$ a labeled parameter of $f_1$ and
$r$ a labeled parameter of $f_2$. If $\langle Ar(q), c, Ar(r) \rangle$ is a transition in $A'_{Desc(L)}$, then $\langle q, c, r \rangle$ is
a transition in $A_{Desc(L)}$. Therefore $t$ is a run of $A_{Desc(L)}$ on $w$. Furthermore, note that
$\langle x, a \rangle \in F_{A_{Desc(L)}}$ and $\langle Ar(x), a \rangle \in F_{A'_{Desc(L)}}$ iff $a = 1$. Therefore $t$ is an accepting run. □

Figure 10: Subsumption: two graphs $\tilde{g}$ and $\tilde{h}$, where $\tilde{g} \preceq \tilde{h}$.

Second, in [4] Ben-Amram and Lee present a polynomial approximation of the LJB algorithm
for SCT. To facilitate a fair comparison, they optimize the LJB algorithm for SCT by using
subsumption to remove certain SCGs when computing the closure under composition. This
suggests that the single-graph search of Algorithm SingleGraphSearch can also employ
subsumption. When computing the closure of a set of graphs under composition, we can ignore
elements when they are approximated by other elements. Intuitively, a graph $g$ approximates
another graph $h$ when it is strictly harder to find a 1-labeled sequence of arcs through $g$
than through $h$. If we can replace $h$ with $g$ without losing arcs, we do not have to
consider $h$. When the right arc can be found in $g$, then it also occurs in $h$. On the
other hand, when $g$ does not have a satisfying arc, then we already have a counterexample.

Formally, given two graphs $g, h \in \widetilde{Q}_B$, say that $g$ approximates $h$, written
$g \preceq h$, when for every arc $(q, a, r) \in g$ there is an arc $(q, a', r) \in h$ with
$a \le a'$. An example is provided in Figure 10. Note that approximation is a transitive
relation. In order to safely employ approximation as a subsumption relation, Ben-Amram and
Lee replace the search for a single arc in idempotent graphs with a search for a strongly
connected component in all graphs. This was proven to be safe in [13]: when computing the
closure of $\widetilde{Q}^1_B$ under composition, it is sufficient to store only maximal
elements under this relation.
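The approximation test can be sketched as follows. This is a minimal illustration, not the code of sct/scp: graphs are assumed to be sets of arcs `(q, a, r)` with labels `a` in {0, 1}, and the names `approximates` and `prune_subsumed` are hypothetical. The pruning step follows the informal description above and drops any graph that is approximated by another stored graph.

```python
def approximates(g, h):
    """True when every arc (q, a, r) of g has an arc (q, a2, r) in h with a <= a2."""
    best = {}
    for (q, a, r) in h:
        # Remember the largest label h offers between q and r.
        best[(q, r)] = max(a, best.get((q, r), a))
    return all((q, r) in best and a <= best[(q, r)] for (q, a, r) in g)


def prune_subsumed(graphs):
    """Keep only graphs that are not approximated by another kept graph."""
    kept = []
    for h in graphs:
        if any(approximates(g, h) for g in kept):
            continue  # h is approximated by a stored graph, so it is ignored
        # h may in turn approximate graphs stored earlier; those are dropped.
        kept = [g for g in kept if not approximates(h, g)]
        kept.append(h)
    return kept
```

Note that the empty graph approximates every graph under this definition, since the condition quantifies over the arcs of $g$.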



4.2. Experimental Results. All experiments were performed on a Dell Optiplex GX620
with a single 1.7 GHz Intel Pentium 4 CPU and 512 MB of memory. Each tool was given 3500
seconds, a little under one hour, to complete each task.



Tools: The formal-verification community has implemented rank-based tools in order to
measure the scalability of various approaches. The programming-languages community has
implemented several Ramsey-based SCT tools. We use the best-of-breed rank-based tool,
Mh, developed by Doyen and Raskin [9], that leverages a subsumption relation on ranks.
We expanded the Mh tool to handle Büchi containment problems with arbitrary languages,
thus implementing the full containment-checking algorithm presented in their paper.

We use two Ramsey-based tools. SCTP is a direct implementation of the LJB algorithm
of Theorem 2.1 written in Haskell [15]. We have extended SCTP to reduce SCT problems
to Büchi containment problems, using either Definition 2.12 or 4.1. sct/scp is an optimized
C implementation of the SCT algorithm, which uses the subsumption relation of Section
4.1 [4].



Problem Space: Existing experiments on the practicality of SCT solvers focus on examples
extracted from the literature [3]. We combine examples from a variety of sources [1] [4] [15]
[17] [21] [28] [32]. The time spent reducing SCT problems to Büchi automata never took
longer than 0.1 seconds and was dominated by I/O. Thus this time was not counted. We
compared the performance of the rank-based Mh solver on the derived Biichi containment 
problems to the performance of the existing SCT tools on the original SCT problems. If 
an SCT problem was solved in all incarnations and by all tools in less than 1 second, the 
problem was discarded as uninteresting. Unfortunately, of the 242 SCT problems derived 
from the literature, only 5 prove to be interesting. 

Experiment Results: Table 1 compares the performance of the rank-based Mh solver
against the performance of the existing SCT tools, displaying which problems each tool
could solve, and the time taken to solve them. Of the interesting problems, both SCTP
and Mh could only complete 3. On the other hand, sct/scp completed all of them, and had
difficulty with only one problem.



Problem              SCTP (s)    Mh (s)      sct/scp (s)
ex04 [4]             1.58        Time Out    1.39
ex05 [4]             Time Out    Time Out    227.7
ms [15]              Time Out    0.1         0.02
gexgcd [15]          0.55        14.98       0.023
graphcolour2 [17]    0.017       3.18        0.014

Table 1: SCT problem completion time by tool.



The small problem space makes it difficult to draw firm conclusions, but it is clear
that Ramsey-based tools are comparable to rank-based tools on SCT problems: the only
tool able to solve all problems was Ramsey-based. This is surprising given the significant
difference in worst-case complexity, and motivates further exploration.

5. Reverse-Determinism 

In the previous section, the theoretical gap in performance between Ramsey and rank-based
solutions was not reflected in empirical analysis. Further investigation reveals that a
property of the domain of SCT problems is responsible. Almost all problems, and every
difficult problem, in this experiment have SCGs whose nodes have an in-degree of at
most 1. This property was first observed by Ben-Amram and Lee in their analysis of SCT
complexity [4]. After showing how this property explains the performance of Ramsey-based
algorithms, we explore why this property emerges and argue that it is a reasonable property
for SCT problems to possess. Finally, we improve the rank-based algorithm for problems
with this property.

Algorithm 4: gcd(X, Y)

1 if Y > X then
2     gcd(X, Y - X)
3 else if X > Y then
4     gcd(X - Y, Y)
5 else return X



As stated above, all interesting SCGs in this experiment have nodes with at most one
incoming edge. In analogy to the corresponding property for automata, we call this property
of SCGs reverse-determinism. Say that an SCG for a call from $f_1$ to $f_2$ is
reverse-deterministic if every parameter of $f_2$ has at most one incoming edge. Given a set
of reverse-deterministic SCGs $\mathcal{G}$, we observe three consequences. First, a
reverse-deterministic SCG can have no more than $n$ arcs: one entering each node. Second,
there are only $2^{O(n \log n)}$ possible such combinations of $n$ arcs. Third, the
composition of two reverse-deterministic SCGs is also reverse-deterministic. Therefore every
element in the closure of $\mathcal{G}$ under composition is also reverse-deterministic.
These observations imply that the closure of $\mathcal{G}$ under composition contains at
most $2^{O(n \log n)}$ SCGs. This reduces the worst-case complexity of the LJB algorithm to
$2^{O(n \log n)}$. In the presence of this property, the massive gap between Ramsey-based
algorithms and rank-based algorithms vanishes, helping to explain the surprising strength
of the LJB algorithm.

Lemma 5.1. When operating on reverse-deterministic SCT problems, the LJB algorithm
has a worst-case complexity of $2^{O(n \log n)}$.

Proof: A reverse-deterministic SCT problem contains only reverse-deterministic SCGs.
Observe that the composition of two reverse-deterministic SCGs is itself
reverse-deterministic. As there are only $2^{O(n \log n)}$ possible reverse-deterministic
SCGs, the closure computed in the LJB algorithm cannot become larger than
$2^{O(n \log n)}$. The LJB algorithm checks each graph in the closure exactly once, and so
has a time complexity of $2^{O(n \log n)}$. □
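The two facts used in this proof can be checked on small examples with the following sketch. It is illustrative only: SCGs are assumed to be sets of arcs `(x, a, y)` with labels in {0, 1}, composition is assumed to follow the usual rule that a composed arc is 1-labeled when some connecting path contains a 1-labeled arc, and the function names are hypothetical. The counting function assumes both functions have $n$ parameters, so each target parameter either has no incoming arc or one arc from one of $n$ sources with one of two labels, giving $(2n+1)^n = 2^{O(n \log n)}$ graphs.

```python
def is_reverse_deterministic(scg):
    """True when no target parameter has more than one incoming arc."""
    targets = [y for (_x, _a, y) in scg]
    return len(targets) == len(set(targets))


def compose(g1, g2):
    """Compose g1 : f1 -> f2 with g2 : f2 -> f3."""
    best = {}
    for (x, a, y) in g1:
        for (y2, b, z) in g2:
            if y == y2:
                # A two-arc path is 1-labeled if either of its arcs is.
                best[(x, z)] = max(best.get((x, z), 0), a, b)
    return frozenset((x, a, z) for (x, z), a in best.items())


def max_reverse_deterministic_scgs(n):
    """Number of reverse-deterministic SCGs between two functions of arity n."""
    return (2 * n + 1) ** n
```

In a reverse-deterministic graph each target has a unique predecessor, so every target of the composition is reached by at most one two-arc path.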

It is not a coincidence that all SCT problems considered possess this property. As noted
in [5], straightforward analysis of functional programs generates only reverse-deterministic
problems. In fact, every tool we examined is only capable of producing reverse-deterministic
SCT problems. To illuminate the reason for this, imagine an SCG $G : f \to g$ where $f$ has
two parameters, $x$ and $y$, and $g$ the single parameter $z$. If $G$ is not reverse
deterministic, this implies both $x$ and $y$ have arcs, labeled with either 0 or 1, to $z$.
This would mean that $z$'s value is both always smaller than or equal to $x$ and always
smaller than or equal to $y$.

The program in Algorithm 4 can produce non-reverse-deterministic size-change graphs,
and serves to demonstrate the difficult analysis required to do so. Consider the SCG for
the call on line 2. It is clear there should be a 0-labeled arc from X to X. To reach this
point, however, we must satisfy the inequality on line 1. Therefore we can also assert that
Y > X, and include a 1-labeled arc from Y to X. This kind of analysis is difficult to
make, and none of the size-change analyzers we examined were capable of detecting this
relation.
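As a small illustration, the two arcs discussed for the call on line 2 can be written down and their in-degrees counted. The encoding of arcs as `(source, label, target)` triples and the helper names are assumptions made for this sketch; only the two arcs mentioned in the text are included.

```python
from collections import Counter

# Arcs for the call gcd(X, Y - X) on line 2, restricted to those discussed above.
line2_scg = {
    ("X", 0, "X"),  # X is passed unchanged
    ("Y", 1, "X"),  # the guard Y > X makes the new X strictly smaller than the old Y
}


def in_degrees(scg):
    """Number of incoming arcs for each target parameter."""
    return Counter(target for (_source, _label, target) in scg)


def violating_parameters(scg):
    """Target parameters with more than one incoming arc."""
    return sorted(p for p, degree in in_degrees(scg).items() if degree > 1)
```

Here the parameter X of the callee has two incoming arcs, so the graph is not reverse-deterministic.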



This example emerges from the Terminweb experiments by Mike Codish, and was translated into a
functional language by Amir Ben-Amram and Chin Soon Lee. The authors are grateful to Amir Ben-Amram
for bringing this illustrative example to our attention.

5.1. Reverse Determinism and Rank-Based Containment. Since the Ramsey-based
approach benefited so strongly from reverse-determinism, we examine the rank-based
approach to see if it can similarly benefit. As a first step, we demonstrate that
reverse-deterministic automata have a maximum rank of 2, dramatically lowering the complexity
of complementation to $2^{O(n)}$. We note, however, that given a reverse-deterministic SCT
problem $L$, the automaton $A_{Desc(L)}$ is not reverse-deterministic. Thus a separate proof is
provided to demonstrate that the rank of the resulting automata is still bounded by 2.

An automaton is reverse-deterministic when no state has two incoming arcs labeled
with the same character. Formally, an automaton is reverse-deterministic when, for each
state $q$ and character $a$, there is at most one state $p$ such that $q \in \rho(p, a)$. As a
corollary to Lemma 5.1, the Ramsey-based complementation construction has a worst-case
complexity of $2^{O(n \log n)}$ for reverse-deterministic automata. With reverse-deterministic
automata, we do not have to worry about multiple paths to a state. As a consequence, a
maximum rank of 2, rather than $2n - 2$, suffices to prove termination of every path, and
the worst-case bound of the rank-based construction improves to $2^{O(n)}$.
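The formal condition can be tested directly on a transition table. The sketch below assumes the transition function is given as a dictionary from `(state, character)` pairs to sets of successor states; this representation and the function name are hypothetical.

```python
def is_reverse_deterministic_automaton(rho):
    """True when each (state q, character a) has at most one a-predecessor p."""
    seen = set()
    for (_p, a), successors in rho.items():
        for q in successors:
            if (q, a) in seen:
                return False  # q already has another predecessor on character a
            seen.add((q, a))
    return True
```

Rank-based tools would need to run such a check explicitly before restricting the rank, whereas the Ramsey-based bound of Lemma 5.1 holds without any test.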

Theorem 5.1. Given a reverse-deterministic Büchi automaton $B$ with $n$ states, there exists
an automaton $B'$ with $2^{O(n)}$ states such that $L(B') = \overline{L(B)}$.

Proof: In a run DAG $G_w$ of a reverse-deterministic automaton, all nodes have only one
predecessor. This implies the run DAG is a tree, and that the number of infinite paths
grows monotonically and at some point stabilizes. Call this point $k$. If $G_w$ is rejecting,
we demonstrate that there is a point $j \ge k$ past which all accepting states are finite in
$G_w$. Observe that each infinite path eventually stops visiting accepting states. Let $j$ be
the last such point over all infinite paths, or $k$, whichever is greater. Past $j$, consider
a branch off an infinite path containing an accepting state. This branch cannot be a new
infinite path, as the number of infinite paths is stable. This branch cannot lead to an
existing infinite path, because that would violate reverse determinism. Therefore this
branch must be finite, and the accepting state is finite.

Recall that $G_w(0)$ is $G_w$, $G_w(1)$ is $G_w(0)$ with all finite nodes removed, $G_w(2)$ is
$G_w(1)$ with all $F$-free nodes removed, and $G_w(3)$ is $G_w(2)$ with all finite nodes
removed. Because there are no infinite accepting nodes past $j$, $G_w(1)$ has no accepting
nodes at all past $j$. Thus every node past $j$ is $F$-free in $G_w(1)$, and $G_w(2)$ has no
nodes past $j$. Thus $G_w(3)$ is empty, and the DAG has a rank of at most 2. We conclude that
the maximum rank of a rejecting run DAG is 2, and the state space of the automaton in
Definition 2.8 can be restricted to level rankings with no ranking larger than 2. □



Unfortunately, neither the reduction of Definition 2.12 nor the reduction of Definition
4.1 preserves reverse determinism, which is to say that given a reverse-deterministic SCT
problem, they do not produce a reverse-deterministic Büchi containment problem. However,
we can show that, given a reverse-deterministic SCT problem, the automaton produced by
Definition 4.1 does have a maximum rank of 2. A similar claim could be made about
Definition 2.12 with minor adjustments.

Formally, we prove that for every reverse-deterministic SCT problem $L$, $A'_{Desc(L)}$ has
a maximum rank of 2. Let $w$ be an infinite word $c_1 c_2 \ldots$ not in $L(A'_{Desc(L)})$, and
$G_w$ the rejecting run DAG of $A'_{Desc(L)}$ on $w$. There are two kinds of states in
$A'_{Desc(L)}$. There is a waiting state, $q_0$, which always transitions to itself, and there
are two states for every variable position $h \in 1..n$, $(h, 0)$ and $(h, 1)$. Every state is
an initial state. Consider $G_w$, the run DAG of $A'_{Desc(L)}$ on a word $c_0 c_1 \ldots$. Each
character $c_i$ represents a function call from some function $f_i$ to another function
$f_{i+1}$. At level $i$ of the run DAG, the waiting state has outgoing edges to itself and the
positions of 0-labeled parameters of $f_{i+1}$. Each variable state only has outgoing edges
to a 0 or 1-labeled position. To get an idea of what the run DAG looks like, Figure 11
displays a supergraph of the run DAG that includes all states at all levels, even if they
are not reachable.

Figure 11: An overapproximation of the run DAG for $A'_{Desc(L)}$, where the maximum arity of
functions in $L$ is 2. For clarity, this figure includes unreachable states. By definition,
however, the DAG has only nodes that can be reached from a node on the first level.

We now prove that the rejecting run DAG has a maximum rank of 2. To do so,
we analyze the structure of $G_w$ by first examining subgraphs, and then extending these
observations to $G_w$. Let $G'_w$ be the subgraph of the run DAG that omits the waiting state
$q_0$ at every level of the run DAG. Every path in $G'_w$ corresponds to a (possibly finite)
thread in the call sequence $c_1 c_2 \ldots$.

Lemma 5.2. For a level $i$, $f \in H$, and $x \in P(f)$, at most one of $((Ar(x), 0), i)$ and
$((Ar(x), 1), i)$ has incoming edges in $G'_w$.

Proof: For $i = 0$, this holds trivially. For $i > 0$, take a pair of nodes $((h, 0), i)$ and
$((h, 1), i)$. The edges from level $i - 1$ correspond to transitions in $A'_{Desc(L)}$ on some
call $c$. As $c$ is a call to a single function, we know there is a unique variable $x$ such
that $h = Ar(x)$. Because $L$ is reverse deterministic, we know that there is at most one
edge in $G_c$ leading to $x$. If there is no edge, then there are no edges entering
$((h, 0), i)$ or $((h, 1), i)$.

Otherwise there is exactly one edge in $G_c$, $y \xrightarrow{r} x$, $r \in \{0, 1\}$. In this
case, the only nodes in $G'_w$ with an edge to either $((h, 0), i)$ or $((h, 1), i)$ are
$((Ar(y), 0), i - 1)$ and $((Ar(y), 1), i - 1)$. By the transition function $\rho_D$, both of
these states transition only to $((h, r), i)$, and only $((h, r), i)$ has incoming edges in
$G'_w$. □

Now, define $G''_w$ to be the subgraph of $G'_w$ containing only nodes with an incoming edge
in $G'_w$. This removes nodes whose only incoming edge was from $q_0$. While this excludes
nodes that begin threads, this cannot change the accepting or rejecting nature of a thread.

Lemma 5.3.

(1) $G''_w$ is a forest.

(2) Every infinite path in $G'_w$ appears in $G''_w$.

(3) Every accepting node in $G_w$ is also in $G''_w$.

Proof:

(1): Lemma 5.2 implies, for every $h$ and $i$, that only one of $((h, 1), i)$ and
$((h, 0), i)$ is in $G''_w$. Combined with the fact that $L$ is reverse deterministic, every
node in $G''_w$ can have at most one incoming edge, and thus it is a forest.

(2): For every infinite path in $G'_w$, all nodes past the first have an incoming edge from
$G'_w$. Every node with an incoming edge from $G'_w$ is in $G''_w$. Thus for every infinite
path in $G'_w$, a corresponding path, perhaps without the first node, occurs in $G''_w$.

(3): Only nodes of the form $((h, 1), i)$ are accepting. Let $((h, 1), i)$ be an accepting
node. By the definition of a run DAG, $((h, 1), i)$ must be reachable if it is in $G_w$. Thus
there is an edge from another node in $G_w$ to $((h, 1), i)$. By the transition function
$\rho_D$, the waiting state $(q_0, i - 1)$ only has an edge to $((h, 0), i)$. Therefore the
only nodes that have an edge to $((h, 1), i)$ are nodes of the form $((h', r), i - 1)$. All
nodes of this form are in $G'_w$, and therefore $((h, 1), i)$ has an incoming edge from
$G'_w$, and is in $G''_w$. □

We can now make observations about the rejecting run DAGs of $A'_{Desc(L)}$ that mirror
those made about rejecting run DAGs of reverse-deterministic automata.

Lemma 5.4. There exists $j \in \mathbb{N}$ where for all $i > j$, $G''_w$ has no infinite
accepting nodes at level $i$.

Proof: As $G''_w$ is a forest, the number of infinite nodes in level $i + 1$ cannot be smaller
than the number of infinite nodes in level $i$. Thus at some level $k$ the number of infinite
nodes reaches a maximum. Past level $k$, each infinite node has a unique infinite path
through $G''_w$. As $G_w$ is rejecting, every infinite path eventually stops visiting
accepting nodes at some level. Let $j$ be the last such point over all infinite paths, or
$k$, whichever is greater. Past $j$, consider an accepting node $v$ that branches off an
infinite path. This branch cannot be part of the existing infinite path, as this path has
ceased visiting accepting nodes. Likewise, this branch cannot be part of a new infinite
path, as the number of infinite paths cannot increase. Therefore $v$ must be finite. □

Lemma 5.5. $G_w(3)$ is empty.

Proof: Let $j$ be the level past which there are no infinite accepting nodes in $G''_w$, as
per Lemma 5.4. This precisely means that, past $j$, every accepting node in $G''_w$ has a
finite path. As all accepting nodes in $G_w$ are in $G''_w$, past $j$ every reachable
accepting node in $G_w$ has a finite path. After level $j$, $G_w(1)$ contains only
non-accepting nodes. This implies that $G_w(2)$ contains no nodes past $j$, and therefore
that $G_w(3)$ is empty. □

Theorem 5.2. Given a reverse-deterministic SCT problem $L$ with maximum arity $n$, there
is an automaton $B'$ with at most $2^{O(n)}$ states such that
$L(B') = \overline{L(A'_{Desc(L)})}$.

Proof: By Lemma 5.5 we know that every rejecting run DAG of $A'_{Desc(L)}$ has a maximum
rank of 2. Therefore it suffices to restrict the rank in Definition 2.8 to 2, replacing all
occurrences of $2n - 2$ in the definition with 2. The resulting automaton is of size
$2^{O(n)}$. □



5.2. Experiments revisited. In light of this discovery, we revisit the experiments and
again compare rank and Ramsey-based approaches on SCT problems. This time we tell
Mh, the rank-based solver, that the problems have a maximum rank of 2. Table 2 compares
the running time of Mh and sct/scp on the five most difficult problems. As before, time
taken to reduce SCT problems to automata containment problems was not counted.



Problem         Mh (s)    sct/scp (s)
ex04            0.01      1.39
ex05            0.13      227.7
ms              0.1       0.02
gexgcd          0.39      0.023
graphcolour2    0.044     0.014

Table 2: SCT problem completion times by tool, exploiting reverse-determinism.



While our problem space is small, the theoretical worst-case bounds of the Ramsey and
rank-based approaches appear to be reflected in the table. The Ramsey-based sct/scp
completes some problems more quickly, but in the worst cases of ex04 and ex05, sct/scp
performs significantly more slowly than Mh. It is worth noting, however, that the benefits of
reverse-determinism on Ramsey-based approaches emerge automatically, while rank-based
approaches must explicitly test for this property in order to exploit it.

5.3. Monotonicity Constraints: Termination Problems Lacking Reverse-Determinism.
Monotonicity constraints [8] are a generalization of size-change graphs. While an
SCG for a call from $f$ to $g$ is bipartite, with edges only from variables of $f$ to variables
of $g$, monotonicity constraints allow edges between any two variables, even of the same
function. In addition, while SCGs only have edges representing less than and less than
or equal relations, monotonicity constraints allow edges representing equality relations. A
collection of monotonicity constraints is called a monotonicity constraint system (MCS).
For a formal presentation, please see [3].

Deciding termination for MCS problems is more involved than for SCT problems, but
correctness similarly relies on Ramsey's Theorem [8]. One method is to reduce an MCS
to an SCT problem through elaboration [3]. Unfortunately, elaboration is an exponential
reduction, and increases the size of the MCS. Alternatively, it is possible to project an
individual monotonicity constraint into an SCG in a lossy fashion. To do so, simply remove
all edges that are not from a variable of $f$ to a variable of $g$, and replace equality edges
with less-than-or-equal edges. By projecting every monotonicity constraint in an MCS down to
an SCG, we obtain an SCT problem. Doing so, however, often removes valuable information
that can still be encoded in a size-change graph. To preserve this information, new arcs
that are logically implied by existing arcs can be added to the monotonicity constraint
before the constraint is projected to an SCG. The simplest implied arcs are those derived
from equality edges: given an equality edge between $x$ and $y$ and an arc $x \to x'$, add
the corresponding arc between $x'$ and $y$ to the monotonicity constraint. Similarly, given
an arc $y \to y'$, add $x \to y'$. More complex implied arcs can be computed by similarly
composing other arcs.
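The lossy projection described above can be sketched as follows. The encoding is an assumption made for illustration: a monotonicity constraint is a set of triples `(u, rel, v)` with `rel` one of `">"`, `">="` or `"="`, the variables of $f$ and $g$ have disjoint names (for example, primed names for $g$), and the function name is hypothetical. The sketch does not compute implied arcs.

```python
def project_to_scg(constraint, f_vars, g_vars):
    """Project one monotonicity constraint onto a size-change graph."""
    label = {">": 1, ">=": 0, "=": 0}  # equality becomes less-than-or-equal
    arcs = {}
    for (u, rel, v) in constraint:
        if rel == "=" and u in g_vars and v in f_vars:
            u, v = v, u  # equality is symmetric; orient it from f to g
        if u in f_vars and v in g_vars:
            # Keep only edges from a variable of f to a variable of g.
            arcs[(u, v)] = max(arcs.get((u, v), 0), label[rel])
    return {(u, a, v) for (u, v), a in arcs.items()}
```

Edges between two variables of the same function are simply discarded here, which is why adding implied arcs before projecting can retain more information.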

We obtained a corpus of 373 monotonicity constraint systems from [7]. From each MCS
we produced three SCT problems: one by projecting directly, one by computing arcs
implied by equality before projecting, and one by computing all implied arcs before
projecting. We again defined a problem to be interesting if either sct/scp or Mh
took more than 1 second to solve it. For every interesting problem, there was no
difference in result and no significant difference in running time between the two types of
implied arcs. Thus we consider only the third, most complex, SCT problem generated from
each MC problem, resulting in nine final problems.

None of the interesting SCT problems produced in this fashion were reverse deterministic.
Given the complexity of monotonicity constraints, this is perhaps unsurprising. Four of the
resulting problems were non-terminating. For these problems, the maximum rank can be
computed. To do so, Mh is initially limited to a rank of 1, and the rank is increased until
Mh can detect non-termination. Table 3 displays the results for these problems. Despite the
lack of reverse-determinism, none of these problems proved difficult for sct/scp, which
took no more than about 0.4 seconds on any of them. However, several were difficult for Mh,
including one that took over eight minutes. In cases where we could bound the rank, the
running time for Mh often improved dramatically. While we again have only a sparse corpus
of interesting problems, these results serve to emphasize the importance of reverse
determinism. Perhaps more interestingly, they suggest that, even in cases where reverse
determinism does not hold, the Ramsey-based approach performs well.
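The rank search described in this paragraph amounts to a simple loop. In the sketch below, `detects_nontermination` is a hypothetical callback standing in for one run of the rank-based solver with a given rank bound; the actual interface of Mh is not described here, and `rank_limit` is an assumed cut-off.

```python
def find_max_rank(detects_nontermination, rank_limit):
    """Raise the rank bound from 1 until non-termination is detected."""
    for rank in range(1, rank_limit + 1):
        if detects_nontermination(rank):
            return rank
    return None  # no bound up to rank_limit sufficed
```

The search only succeeds on non-terminating problems, which is consistent with the maximum rank being reported only for those problems.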



Problem           rank    Mh (s)         sct/scp (s)
Test3             N/A     4.44           0.047
Test4             N/A     4.65           0.079
Test5             N/A     111.8          0.074
Test6             N/A     482.0          0.097
WorkingSignals    13      1.32 (1.0)     0.098
Gauss             3       1.10 (0.08)    0.146
PartitionList     3       1.38 (0.22)    0.081
Sudoku            5       7.18 (2.42)    0.405

Table 3: MC problem, maximum rank for non-terminating problems, and completion times by tool.
Times for Mh in parentheses are times when given the maximum rank, as if it were precomputed.



6. Conclusion 

In this paper we demonstrate that the Ramsey-based size-change termination algorithm
proposed by Lee, Jones, and Ben-Amram [24] is a specialized realization of the 1987 Ramsey-
based complementation construction [5] [29]. With this link established, we compare rank-
based and Ramsey-based tools on the domain of SCT problems. Initial experimentation
revealed a surprising competitiveness of the Ramsey-based tools, and led us to further
investigation. We discover that SCT problems are naturally reverse-deterministic, reducing
the complexity of the Ramsey-based approach. By exploiting reverse determinism, we were
able to demonstrate the superiority of the rank-based approach.

Our initial test space of SCT problems was unfortunately small, with only five
interesting problems emerging. Despite the very sparse space of problems, it still yielded
two interesting observations. First, subsumption appears to be critical to the performance
of Büchi complementation tools using both rank and Ramsey-based algorithms. It has
already been established that rank-based tools benefit strongly from the use of subsumption
[9]. Our results demonstrate that Ramsey-based tools also benefit from subsumption, and
in fact our experiments suggest that removing subsumption from sct/scp limits its scalability.
Second, by exploiting reverse determinism, we can dramatically improve the performance
of both rank and Ramsey-based approaches to containment checking.

Reverse determinism, however, is not the whole story in comparing the rank and
Ramsey-based approaches. On a separate corpus of problems derived from monotonicity
constraints, which are not reverse-deterministic, the Ramsey-based approach outperformed the
rank-based approach in every interesting case. It should be noted that, in addition to reverse
determinism, there are several ways to achieve a better bound on the maximum rank than
$2n - 2$ [16] [18], even for problems that are not known to be non-terminating. The
rank-based approach might prove more competitive if such analyses were applied before checking
containment. Nonetheless, it is clear that despite the theoretical differences in complexity,
we cannot discount the Ramsey-based approach. The competitive performance of
Ramsey-based solutions remains intriguing.

In [9] [30] a space of random automata universality problems is used to provide a diverse
problem domain. Unfortunately, it is far more complex to similarly generate a space of
random SCT problems. First, universality involves a single automaton: SCT problems check
the containment of two automata, with a corresponding increase in parameters. Worse,
there is no reason to expect that one random automaton will have any probability of
containing another random automaton. Sampling this problem space is further complicated by
the low transition density of reverse-deterministic problems: in [9] [30] the most interesting
problems had a transition density of 2.

On the theoretical side, we have extended the subsumption relation present in sct/scp.
Recent work has extended the subsumption relation to the double-graph search of Algorithm
DoubleGraphSearch, and others have improved the relation through the use of simulation
[2]. Doing so has enabled us to compare Ramsey and rank-based approaches on the
domain of random universality problems [14], with promising results. Future work will
investigate how to generate an interesting space of random containment problems, addressing
the concerns raised above.

The effects of reverse-determinism on the complementation of automata bear further
study. Reverse-determinism is not an obscure property: it is known that automata derived
from LTL formulas are often reverse-deterministic [12]. As noted above, both rank and
Ramsey-based approaches improve exponentially when operating on reverse-deterministic
automata. Further, Ben-Amram and Lee have defined SCP, a polynomial-time approximation
algorithm for SCT [4]. For a wide subset of SCT problems with restricted in-degrees,
including the set used in this paper, SCP is exact. In terms of automata, this property is
similar, although perhaps not identical, to reverse-determinism. The presence of an exact
polynomial algorithm for the SCT case suggests that an interesting subset of Büchi containment
problems may be solvable in polynomial time. The first step in this direction would be to
determine what properties a containment problem must have to be solved in this fashion.